Humanloop

Syncs AI session logs and user feedback into your product roadmap and automates the creation of evaluation experiments to refine prompt performance.


Why use Ceven?

  1. AI-native Humanloop integration

    • Describe the outcome and Ceven picks the right Humanloop calls, fills the parameters, and checks the result.
    • Structured, agent-friendly tool schemas so each call runs reliably instead of by guesswork.
    • Rich coverage for reading, writing, and querying your Humanloop data, across all 12 of its actions.
  2. Managed auth

    • Encrypted API key storage, with key rotation that takes effect as soon as you update the key in Ceven.
    • One place to manage, scope, and revoke Humanloop access.
    • Per-user and per-environment credentials instead of shared keys.
  3. Agent-optimized design

    • Actions are tuned from real success and error rates so reliability climbs over time.
    • Full execution logs so you always know what ran in Humanloop, when, and on whose behalf.
    • The agent pauses and asks when a Humanloop request is ambiguous instead of plowing ahead.
  4. Enterprise-grade security

    • Fine-grained access so you control which agents and people can reach Humanloop.
    • Least privilege by default: read scopes first, then only the writes a workflow needs.
    • A full audit trail of every Humanloop action to support review and sign off.

Supported tools

Every action Ceven's agents can run on Humanloop, and when to use it.

Create project
Use this when you need to spin up a new isolated environment for a specific AI feature or a new model test.
Delete project
Permanently remove a project and all its associated sessions and evaluations. Use this for cleaning up old experiments.
List experiments
Pull all experiments for a project to compare prompt versions and check which iteration has the highest score.
List sessions
Retrieve a paginated list of user interactions. Use this to find specific traces for debugging or feedback analysis.
Get session details
Pull the full input and output trace for a single session ID to analyze exactly where a model failed.
Create evaluation
Submit a score or label for a specific session. Use this to programmatically mark a response as correct or incorrect.
Update prompt
Push a new prompt version to a project. Use this when a workflow identifies a better prompt via an experiment.
Search sessions
Query sessions by metadata or text content to find common failure patterns across your user base.
List projects
Pull a list of all active projects in the organization to map them to internal product modules.
Create datapoint
Add a specific input-output pair to a dataset for future gold set testing and benchmarking.
Get experiment
Pull detailed metrics and results for a specific experiment ID to determine the winning prompt.
Archive project
Move a project out of the active view without deleting the data. Use this for seasonal AI campaigns.

12 actions

Frequently asked questions

Ceven connects to Humanloop using an API key from your Humanloop project settings. When you add your key, we encrypt it with AES-256 and store it in a secure vault. The agent retrieves the key only at the moment it needs to make a request to the Humanloop API. We never share the key with the LLM itself; the model sees only the tool definitions and the resulting data. You can rotate your API key in the Humanloop dashboard at any time, and updating it in Ceven restores the connection immediately without breaking your existing workflows.
Ceven can automate prompt improvement end to end. You can build a workflow where Ceven monitors your sessions for a specific trigger, such as a low rating. When that happens, the agent pulls the session trace, sends it to a prompt optimizer, and then pushes the new version to Humanloop with the Update prompt action so it can be tested in an experiment. It can then wait for a set number of evaluations to come in and automatically promote the winning prompt to production. This creates a self-healing loop where your AI improves based on real-world usage, without a developer manually copying prompts between the console and the code.
One specific quirk of the Humanloop API is that session listing is heavily paginated. If you have millions of logs, a single request will not return everything. Ceven handles this by automatically walking the pagination cursors in the background. However, be aware that very large data pulls can hit Humanloop rate limits on the standard tier. If you notice timeouts during massive backfills, we recommend filtering your session searches by a tighter date range or using specific metadata tags to reduce the payload size and avoid hitting the API throttle.
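The cursor walking and throttle handling described above can be sketched roughly as follows. This is a minimal illustration, not Ceven's implementation or the Humanloop client: `fetch_page`, `RateLimited`, and the `items` / `next_cursor` field names are hypothetical stand-ins for whatever the real API returns.

```python
import time
from typing import Any, Callable, Dict, Iterator, Optional


class RateLimited(Exception):
    """Hypothetical error that fetch_page raises when the API throttles a request."""


def walk_sessions(
    fetch_page: Callable[[Optional[str]], Dict[str, Any]],
    max_retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[Any]:
    """Yield every session by following cursors until none is returned.

    fetch_page(cursor) is assumed to return a dict shaped like
    {"items": [...], "next_cursor": str | None}.
    """
    cursor: Optional[str] = None
    while True:
        # Retry the same page with exponential backoff when throttled.
        for attempt in range(max_retries + 1):
            try:
                page = fetch_page(cursor)
                break
            except RateLimited:
                if attempt == max_retries:
                    raise
                sleep(base_delay * 2 ** attempt)
        yield from page["items"]
        cursor = page.get("next_cursor")
        if cursor is None:
            return
```

Because the function is a generator, a workflow can stop early once it has found the sessions it needs, which keeps large backfills from pulling more pages than necessary.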
Ceven works with any project that has an active API key. Whether you use Humanloop for simple prompt versioning or complex model evaluations, the agent can interact with the project. The only restriction comes from your Humanloop permission level. If your API key is scoped to read-only access, the agent cannot execute write actions like Create project or Update prompt. We recommend using an admin key for full workflow automation, but you can restrict the key if you only want Ceven to perform analysis and reporting.
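As a rough sketch of how a key's scope gates the supported actions: the read/write split below is inferred from the action names on this page, and the scope labels `"read_only"` and `"admin"` are hypothetical, not Humanloop's actual permission names.

```python
# Assumption: these six actions modify data; the other six only read it.
WRITE_ACTIONS = frozenset({
    "Create project",
    "Delete project",
    "Archive project",
    "Create evaluation",
    "Update prompt",
    "Create datapoint",
})


def is_allowed(action: str, key_scope: str) -> bool:
    """Return True if a key with the given (hypothetical) scope may run the action.

    Any scope other than "admin" is treated as read-only.
    """
    if key_scope == "admin":
        return True
    return action not in WRITE_ACTIONS
```

With a read-only key, `is_allowed("List sessions", "read_only")` is true while `is_allowed("Update prompt", "read_only")` is false, which matches the analysis-and-reporting setup described above.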
The agent can identify high-value interactions in your logs and promote them to datapoints. For example, you can set up a rule that any session marked as a perfect response by a human reviewer is automatically sent to Humanloop as a gold set datapoint. This lets you build a high-quality benchmark dataset over time without manual data entry. The agent maps the input and output from the session directly into the datapoint format required by Humanloop, so your evaluation sets stay current as your product evolves.
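The promotion rule above might look something like this in outline. The field names (`review_label`, `inputs`, `output`) and the `Datapoint` shape are illustrative assumptions, not the datapoint format Humanloop actually requires.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class Datapoint:
    """Hypothetical gold set entry: the session inputs plus the expected output."""
    inputs: Dict[str, Any]
    target: str


def to_gold_datapoint(
    session: Dict[str, Any], perfect_label: str = "perfect"
) -> Optional[Datapoint]:
    """Map a reviewed session to a datapoint, or return None if it does not qualify."""
    if session.get("review_label") != perfect_label:
        return None
    # Copy the inputs so later changes to the session do not alter the datapoint.
    return Datapoint(inputs=dict(session["inputs"]), target=session["output"])
```

Sessions without a reviewer label, or with any label other than the configured one, are skipped, so only human-approved responses reach the benchmark set.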
Ceven can route Humanloop feedback into your other tools, and this is a primary use case. You can create a workflow that triggers whenever a new evaluation is created in Humanloop. The agent can then take that feedback, summarize the user complaint, and create a ticket in Linear, Jira, or Trello. It can also post a notification to a Slack channel so your engineering team sees the failure in real time. This ensures that the insights captured in Humanloop actually lead to product changes instead of sitting in a dashboard that nobody checks daily.
Ceven does not use your Humanloop data to train models. We treat your Humanloop data as strictly confidential. The data pulled from your sessions and experiments is used only to provide context for the specific task the agent is performing. It is held in a temporary context window for the duration of the workflow execution and is not used to train any base models. We follow strict data isolation protocols, meaning your project data is never visible to other Ceven users, and your proprietary prompts remain your intellectual property throughout the entire automation process.
The Delete project action is permanent. Because this is a destructive operation, Ceven implements a safety check for any workflow that calls this tool. The agent will typically ask for human confirmation via a notification before executing the deletion. We strongly recommend using the Archive project action instead if you think you might need the data later. Once a project is deleted via the API, Humanloop cannot recover the associated sessions or evaluations, so we treat this action with the highest level of caution.
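A confirmation gate of the kind described above could be sketched like this. It is an illustration under stated assumptions, not Ceven's actual logic: `execute` and `confirm` are hypothetical callbacks standing in for the real tool call and the human approval notification.

```python
from typing import Any, Callable, Dict

# Assumption: only Delete project is treated as destructive in this sketch.
DESTRUCTIVE_ACTIONS = frozenset({"Delete project"})


def run_action(
    name: str,
    params: Dict[str, Any],
    execute: Callable[[str, Dict[str, Any]], Any],
    confirm: Callable[[str, Dict[str, Any]], bool],
) -> Dict[str, Any]:
    """Run an action, asking for human confirmation first if it is destructive."""
    if name in DESTRUCTIVE_ACTIONS and not confirm(name, params):
        # Nothing is executed when the reviewer declines.
        return {"status": "cancelled", "action": name}
    return {"status": "done", "action": name, "result": execute(name, params)}
```

Non-destructive actions such as Archive project pass straight through without a prompt, which is one more reason to prefer archiving when the data might be needed later.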

Alternatives to Humanloop

Other tools that solve a similar problem. Ceven supports these too, so you can switch or run more than one at once.

LangSmith
Weights & Biases
Arize Phoenix

Try Ceven on your stack

Layer Ceven on top of the tools you already run. Connect Humanloop and the rest of your stack, describe the outcome, and its agents handle the work end to end, finishing days of work in minutes.

Get started for free