Startup owner using AI with this need - needless to say, a real problem. I've considered DIYing an internal service for this - even if we went with you we'd probably have an intern do a quick and dirty copy, which I rarely advocate for if I can offload to SAAS. I'm sure you've put a fair bit of work into this that goes well beyond the human interaction loop, but that's really all we need. Your entry price is steep (I'm afraid to ask what an enterprise use-case looks like) and this isn't complicated to make. We don't need to productize or have all the bells and whistles - just human interaction occasionally. Any amount of competition would wipe out your pricing, so no I would not want to pay for this.
thanks for the validation of the problem! totally open to feedback about the solution, and totally get that you only need something simple for now. I want to point out that we do have a pay-as-you-go tier which is $20 for 200 operations, and have a handful of indie devs finding this useful for back-office style automations.
ALSO - something I think about a lot - if all/most of the HumanLayer SaaS backend was open source, would that change your thinking?
My gut feeling is with where we're headed we'll clear that 200 pretty quickly in production cases, so we'd be interested in a bit higher volume. Our dev efforts would probably clear that 200/mo. If the flow/backend was open-source that'd be a total game changer for us as I see it as an integral part of our product.
edit: I want to add here that while ycomb companies like yourself may have VC backing, a lot of us don't and do consider 500+/mo. base price on a service that is operations-limited to be a lot. You need to decide who your target audience is, I may not be in that audience for your SAAS pricing. This seems like a service that a lot of people need, but it also stands out to me as a service that will be copied at an extravagantly lower price. We have truly entered software as a commodity when I, a non-AI engineer, can whip up something like this in a week using serverless infra and $0.0001/1k tokens with gpt-4o mini.
that makes sense - and I have wondered a lot, even more generally, about the price of software and what makes a hard problem hard. Like Amjad from Replit said on a podcast recently: "can anyone build 'the next salesforce' in a world where anyone can build their own salesforce with coding agents and serverless infra"
I think in building this, some of the things that folks decided they don't want to deal with are, like, the state machine for escalations/routing/timeouts, the infrastructure to catch inbound emails and turn them into webhooks, or stitching a single agent's context window together with multiple open slack threads. But you're right, that can all be solved by a software engineer with enough time and interest in solving the problem.
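To make the "state machine for escalations/routing/timeouts" part concrete, here is a rough sketch of what the smallest version might look like. This is not HumanLayer's implementation; the class names, the contact list, and the single per-contact timeout are all assumptions made up for illustration, and time is passed in explicitly so the logic stays easy to test.

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"
    EXPIRED = "expired"


@dataclass
class ApprovalRequest:
    # Ordered escalation chain, e.g. ["on-call", "team-lead"] (hypothetical names).
    contacts: list[str]
    timeout_s: float            # how long each contact gets before we escalate
    created_at: float = 0.0
    state: State = State.PENDING
    level: int = 0              # index of the contact currently being asked
    notified: list[str] = field(default_factory=list)
    last_notified_at: float = field(init=False, default=0.0)

    def __post_init__(self) -> None:
        # The first contact is notified as soon as the request is created.
        self.last_notified_at = self.created_at
        self.notified.append(self.contacts[0])

    def tick(self, now: float) -> None:
        """Advance the clock: escalate, or expire once the chain is exhausted."""
        if self.state is not State.PENDING:
            return
        if now - self.last_notified_at < self.timeout_s:
            return  # current contact still has time to answer
        if self.level + 1 < len(self.contacts):
            self.level += 1
            self.last_notified_at = now
            self.notified.append(self.contacts[self.level])
        else:
            self.state = State.EXPIRED

    def respond(self, approved: bool) -> None:
        """Record a human decision; responses after a terminal state are ignored."""
        if self.state is State.PENDING:
            self.state = State.APPROVED if approved else State.REJECTED
```

Even this toy version has to decide what happens to a late approval after expiry, which is the kind of edge case that tends to eat the time in a DIY build.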
I will need to clear up the pricing page as it sounds like I didn't do a good job (great feedback thank you!) - it's basically $20/200 credits, and you can pay-as-you-go, and re-up for more whenever you want. We are early and delivering value is more important to me than extracting every dollar, especially out of a fellow founder who's early. If you genuinely find this useful, I would definitely chat and collaborate/partner to figure out something you think is fair, where you're getting value and you get to focus on your core competency. feel free to email me dexter at humanlayer dot dev
I was wondering: have you thought about automation bias or automation complacency [0]? Sticking with the drop-tables example: if you have an agent that works quite well, the human in the loop will nearly always approve the task. The human will then learn over time that the agent "can be trusted", and will stop reviewing the pings carefully. Hitting the "approve" button will become somewhat automated by the human, and the risky tasks won't be caught by the human anymore.
this is fascinating and resonates with me on a deep level. I'm surprised I haven't stumbled across this yet.
I think we have this problem with all AI systems, e.g. I have let cursor write wrong code from time to time and don't review it at the level I should...we need to solve that for every area of AI. Not a new problem but definitely about to get way more serious
Oh man, the API call for hl.human_as_tool() is a little ominous. Obviously approving a slack interaction is no big deal, but it does have a certain attitude towards humans that doesn't bode well for us...
the dystopian startups that use bounding boxes to observe workers in a warehouse and give the boss a report on how many breaks they took...they're here
P.S. nobody asked but since you made it this far - the next big problem in this space is fast becoming, what else do we need to be able to build these "headless" or "outer loop" AI agents? Most frameworks do a bad job of handling any tool call that would be asynchronous or long running (imagine an agent calling a tool and having to hang for hours or days while waiting for a response from a human). Rewiring existing frameworks to support this is either hard or impossible, because you have to
1. fire the async request,
2. store the current context window somewhere,
3. catch a webhook,
4. map it back to the original agent/context,
5. append the webhook response to the context window,
6. resume execution with the updated context window.
I have some ideas but I'll save that one for another post :) Thanks again for reading!
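For anyone who wants to see the six steps above in one place, here is a minimal sketch. It assumes an in-memory dict standing in for a durable store, a caller-supplied `run_agent` function, and a webhook payload shaped like `{"request_id": ..., "response": ...}`. All of those names are hypothetical and not taken from any particular framework.

```python
import uuid

# In-memory stand-in for a durable store (a real system would use a database).
PAUSED = {}


def fire_async_request(question: str) -> str:
    """Step 1: send the question to a human. Here we only mint a correlation id."""
    return uuid.uuid4().hex


def pause_agent(messages: list, question: str) -> str:
    """Fire the request and park the agent's context until the human answers."""
    request_id = fire_async_request(question)
    # Step 2: store a copy of the context window, keyed by the correlation id.
    PAUSED[request_id] = list(messages)
    return request_id


def handle_webhook(payload: dict, run_agent):
    """Step 3: the function a webhook endpoint would call when the human replies."""
    # Step 4: map the payload back to the original agent/context.
    messages = PAUSED.pop(payload["request_id"])
    # Step 5: append the human's response to the context window.
    messages.append({"role": "tool", "content": payload["response"]})
    # Step 6: resume execution with the updated context window.
    return run_agent(messages)
```

The agent process can exit entirely between `pause_agent` and `handle_webhook`, which is the whole point: nothing hangs for hours or days. The hard part in practice is that most frameworks keep the context window inside a running loop rather than somewhere it can be stored and reloaded.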
I'm considering this for a workflow agent and would be keen to hear thoughts on this process.
We're a medical device company, so we need to do ISO13485 quality assurance processes on changes to software and hardware.
I had already been thinking of using an LLM to help ensure we are surfacing all potential concerns and ensure they are addressed. Partly relying on the LLM, but really as a method to manage the workflow and confirm that our processes are being followed.
Any thoughts on if this might be a good solution? Or other suggestions by other HN users.
Looking forward to playing with HumanLayer. The slack integration looks a lot more useful for my workflows than other tools I've tried.
In the demo video and example, you show a faked LinkedIn messages integration. Do you have any recommendations for tools that can actually integrate with live LinkedIn messages?
thanks for sharing your experience so far! Like I said, we built this ourselves for another idea and it was painful.
I have played with Make and I actually chatted w/ the gotohuman guy on zoom a while back, I like his approach as well, he went straight to webhooks which makes sense for big production use cases
re: LinkedIn, no I don't know how to get agents to integrate with linkedin. I have tried a bunch of things, I know of some YC companies that tried this but I don't know how it went for them. Best I have gotten is using stagehand/dendrite with browserbase to do it with a browser, and then using humanlayer human_as_tool to ping me if it needs an MFA token or other inputs
Thanks for the reply! I've used a bunch of grey market 3rd party tools for LinkedIn automation. Most of them have some sort of API. I'll try integrating with HumanLayer.
so I played with MCP for a while last night and I think MCP is great as a layer to pull custom tools into the existing claude/desktop/chat experience. But at the end of the day it's just a basic agentic loop over tool calls.
If you want to tell a model to send a message to slack, sure, give it a slack tool and let it go wild. Do you see a way that MCP applies to outer-loop or "headless" agents that's any different from another tool-calling agent like langchain or crewai? It seems like just another protocol for tool calling over the stdio wire (WHICH, TO BE CLEAR, I FIND SUPER DOPE)
Congrats on the launch, this is an interesting concept. It's somewhat akin to developers approving LLM generated code changes and pull requests. I feel much more comfortable with senior developers approving AI changes to our codebase than letting loose an autonomous agent with no human oversight.
super relevant - yeah I think it was someone at anthropic who framed this as "cursor tab autocomplete, but for arbitrary API calls" - basically for everything else other than code
congrats on the launch dex! this is a problem that i've already seen come up a dozen times and many companies are building it internally in a variety of different ways. easier to buy vs. build for something like this imo, glad it's being built!
Congrats Dex! Excited to see what people build with this + tools like Stripe's new agent payments SDK (issuing a payment seems like a great place to ask permission).
My favorite part of all this is that it’s inevitable. Someone has to solve agent adoption in whatever-the-environment-already-is. And nobody is doing this well at scale. Europe is mandating this. And even though Article 14 of the AI Act won’t be enforced until 2026, I’m glad projects like this are working ahead. Get after it, Dex!
What I don't understand from quickly skimming your description and homepage: Do you source/provide the humans in the loop? That's a good value add, but how do I automatically / manually vet how you do the routing?
great question - yeah i was actually heavily inspired by people trying to figure that stuff out on reddit back in july, and realizing that mapping that human input across slack, email, sms was never going to be a core focus for those agent frameworks
I work in operations/finance. I've experimented with integrating LLMs into my workflow. I would not feel comfortable 'handing the wheel' to an LLM to make actions autonomously. Something like this to be able to approve actions in batches, or approve anything external facing would be useful.
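As a rough illustration of the "approve in batches, or approve anything external facing" idea, a gate like this could sit between the LLM and the actions it proposes. This is only a sketch: the `Action` and `ApprovalQueue` names and the `external` flag are invented for the example and are not part of any product mentioned in this thread.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Action:
    name: str
    run: Callable[[], str]
    external: bool  # True if the action is visible outside the company


class ApprovalQueue:
    def __init__(self) -> None:
        self.pending: list[Action] = []

    def submit(self, action: Action):
        """Run internal actions immediately; hold external-facing ones for review."""
        if action.external:
            self.pending.append(action)
            return None
        return action.run()

    def approve_batch(self, approved_names: set[str]) -> list[str]:
        """Run the approved actions; everything else in the batch is dropped."""
        batch, self.pending = self.pending, []
        return [a.run() for a in batch if a.name in approved_names]
```

A reviewer would look over `pending` once or twice a day and pass back only the names they are happy with, so nothing external goes out without a human decision.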
Hiring humans to do a consistent job is gonna be a nightmare and a limit on the scalability of the service. How are you defining your service level agreements?
This is the first new YC launch I've seen involving AI that I am extremely positive about. I have worked with systems implementing similar functionality ad-hoc already, but seeing it as a buy-in service - and one so easy to integrate - is really cool.
From what I've seen, this will bring the implementation needs for this kind of functionality down from "engineering team" to a single programmer.
hah thanks dude! I am very bullish on TS as the long term thing, Not to turn this into a language vs language thread but I spend a lot of time thinking about why ppl struggle so much with python...so far I came up with
- concurrency abstractions keep changing (still transitioning / straddling sync+threads vs. asyncio) - this makes performance eng really hard
- package management somehow less mature than JS - pip been around way longer than npm but JS got yarn/lockfiles before python got poetry
- the types are fake (also true of typescript, I think this one is a wash)
- the types are fake and newer. typing+pydantic is kinda bulky vs. TS having really strong native language support (even if only at compile time)
- virtual environments!?! cmon how have we not solved this yet
- wtf is a miniconda
- VSCode has incredible TS support out of the box, python is via a community plugin, and not as many language server features
Looks amazing! (Also, I've known Dexter since before Human Layer and he's a force of nature. If you think this is interesting now, you're going to be amazed at where it goes)
Just an idea: having a little widget in the MacOS menu bar that pops up or sends you a notification to solve a human task wouldn't be so terrible either.
that's a great idea - I put together one example for getting an MFA code for a website, but the captcha thing "pull a human into a web session" is something I've wanted to play with for a while