Where AI Value Actually Came From in 2025 (Moves 1-3)
11 case studies showing how teams moved decisions upstream, removed human middleware, and built systems that finish work


From Just Curious, a decision-support platform for leaders making high-stakes AI decisions.
Introduction
If Part I of our Year in Review established why so many AI initiatives stall, Part II shows what happens when teams get the prerequisites right.
Over the past year, we spoke with hundreds of applied AI practitioners and conducted more than 100 in-depth interviews. From that body of work, we identified 17 organizations that moved beyond pilots and built production systems that changed throughput, cost structure, risk, or revenue in measurable ways.
Across these cases, AI value did not come from better models or novel tools. It came from a small number of repeatable operational moves—decisions about where automation belonged, how humans were positioned in the loop, and which sources of latency actually mattered.
Taken together, the 17 case studies in Part II collapse into six operational moves that consistently determined whether AI efforts compounded or quietly stalled.
Because of length, we’re publishing Part II across two issues.
Today’s edition covers the first 11 case studies across the first three moves:
moving decisions upstream,
removing human middleware,
and designing systems that finish work end-to-end.
These cases include:
How a Mid-Sized Restoration Company Cut Insurance Rebuttal Time from 3 Hours to 10 Minutes
How a Wind Power Operator Reduced Turbine Downtime by Fixing Pre-Dispatch Decisions
How a Hedge Fund Raised First-Pass Compliance Accuracy from 30% to 95%
How an E-Commerce Eyewear Brand Cut Prescription Verification from Days to Seconds
How a Regional ISP Cut Technician Idle Time by 45 Minutes per Job
How a Private Credit Firm Increased Deal Throughput by 50% Without Adding Headcount
How a PE-Backed Services Roll-Up Eliminated 40% of Redundant Work by Auditing Real Workflows
How an E-Commerce Company Cut Product Classification Time from 24 Hours to 30 Seconds
How a Billboard Marketplace Cut Content Approval Latency to Near-Instant
How a PE Investment Team Cut Deal Screening Time by Turning Institutional Memory into a System
How a Hedge Fund Cut Equity Research Prep Time by 80% Without Losing Rigor
Tomorrow’s edition will cover the remaining six case studies, focused on deterministic systems in high-stakes domains, adoption dynamics, and how teams compressed time-to-value without large platform rebuilds.
What follows in this issue are concrete examples—drawn from private equity, manufacturing, insurance, healthcare, education, and professional services—of how teams moved decisions upstream, removed human middleware, and built systems that completed work end to end.
They are presented as both inspiration and evidence.
Executive Summary
From Diagnosis to Application: How AI Actually Created Value in Practice
Across these cases, one conclusion stands out: AI value did not come from better models, novel tools, or ambitious roadmaps. It came from a small number of repeatable operational moves, executed precisely.
In practice, value came from concrete shifts:
A restoration company cut insurance rebuttal time from hours to minutes by moving negotiation upstream.
A hedge fund raised first-pass compliance accuracy from 30% to 95% by replacing downstream review with pre-validation.
A preschool operator expanded EBITDA by optimizing staffing hour by hour using deterministic optimization rather than generative AI.
Different industries. Different tools. The same underlying operational moves.
The organizations featured here varied widely in size, industry, and technical maturity—but the cases resolve into a small set of operational patterns that explain why some efforts compounded while others quietly stalled.
AI value was unlocked by redesigning where decisions happen
In the most successful deployments, teams did not start by accelerating existing workflows. They started by identifying where decisions were delayed, distorted, or deferred, and redesigned that point in the system.
Often, the perceived problem was downstream: slow collections, reporting lag, compliance review, or manual reconciliation. In practice, audits revealed that the true constraint lived upstream, in intake, preparation, validation, or handoffs between systems. By intervening earlier, teams unlocked both speed and capacity without adding headcount.
High-leverage systems replaced human mediation, not human judgment
Across cases, the highest ROI came from eliminating humans as intermediaries, not decision-makers. These systems removed people from low-value translation work—copying data between tools, reconciling formats, checking rules—while preserving human oversight where judgment, trust, or accountability mattered.
Successful teams were explicit about this boundary. They built systems that completed tasks autonomously by default, surfaced exceptions clearly, and showed their work. This design both accelerated throughput and increased adoption, because operators could see why the system acted.
Designing systems to complete work, not assist it
A third pattern appeared across many of the highest-leverage cases: teams stopped building tools that helped humans do work and instead built systems that completed the work by default.
Rather than copilots, chat wrappers, or suggestion engines, these teams designed workflows that ran end-to-end automatically and escalated only low-confidence exceptions. Humans shifted from generating outputs to validating results.
This distinction mattered, as systems designed for task completion reduced coordination overhead, shortened cycle times, and earned adoption faster than assistive tools that required constant prompting, interpretation, or follow-up.
What these cases collectively show
Taken together, these cases show that AI advantage is no longer theoretical. Organizations that understand how work actually happens, intervene at the right decision points, and apply AI selectively are already compounding operational gains, while others, often with similar tools and resources, quietly stall.
This issue is long by design. If your email client clips the message, you can read the full piece here:
Move 1: Move the Decision Upstream
How a Mid-Sized Restoration Company Cut Insurance Rebuttal Time from 3 Hours to 10 Minutes, with Jack Weissenberger, Ciridae
Starting Conditions (Status Quo)
A mid-sized home restoration company was throttled by its insurance negotiation process. The CEO and COO spent 20–30% of their time manually comparing their internal proposals against insurance adjuster quotes to settle payouts. This required reconciling thousands of line items by hand to identify discrepancies. The tedious nature of the work—taking up to three hours per proposal—capped deal volume and diverted executive focus from growth to data entry. As Jack Weissenberger described, this work represented “the core bottleneck of their business… taking 20 to 30% of the CEO and COO’s time who were experts at this.”
The Intervention
While the client initially requested a solution for Accounts Receivable, an on-site audit revealed the true bottleneck was the upstream rebuttal workflow. Ciridae deployed a custom AI automation designed specifically to ingest and compare disparate proposal formats, shifting the focus from collections to the negotiation phase. This reframing surfaced the negotiation workflow—not invoicing—as the primary limiter of cash flow and throughput.
The Redesign
The manual “stare and compare” workflow was replaced with an automated ingestion engine. The system takes the restoration firm’s proposal and the insurance carrier’s PDF as inputs. It automatically matches line items, identifies metadata discrepancies, and flags missing or added items. As Weissenberger explained, “the inputs are their proposal and the insurance proposal and the output is a fully annotated PDF and system that walks through here all of the line items that match exactly [and] here are the line items that are missing from your proposal.” This allowed operators to move from manual hunting to structured verification.
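For illustration, here is a minimal sketch of what the comparison step could look like once both documents have been parsed into structured records. The field names and match-by-code logic are assumptions made for clarity, not Ciridae's actual implementation, which also handles PDF ingestion, annotation, and messier real-world formats.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LineItem:
    code: str           # shared line-item identifier (assumed; real matching is fuzzier)
    description: str
    quantity: float
    unit_price: float

def reconcile(our_proposal: list[LineItem], carrier_proposal: list[LineItem]) -> dict:
    """Bucket every line item: exact matches, discrepancies, and items missing on either side."""
    ours = {item.code: item for item in our_proposal}
    theirs = {item.code: item for item in carrier_proposal}

    matched, discrepancies = [], []
    for code in ours.keys() & theirs.keys():
        a, b = ours[code], theirs[code]
        if (a.quantity, a.unit_price) == (b.quantity, b.unit_price):
            matched.append(code)
        else:
            discrepancies.append({"code": code, "ours": a, "carrier": b})  # metadata mismatch to annotate

    return {
        "matched": matched,
        "discrepancies": discrepancies,
        "missing_from_ours": sorted(theirs.keys() - ours.keys()),
        "missing_from_carrier": sorted(ours.keys() - theirs.keys()),
    }
```

The annotated output described in the case is essentially this result rendered back onto the documents; the operator's job shifts to reviewing the discrepancy and missing-item buckets rather than hunting for them.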
The After State
Processing time per proposal dropped from three hours to under ten minutes. The system identified hundreds to thousands of dollars in missed value per job by eliminating human fatigue errors common in manual review. Leadership reclaimed 30% of their workweek, allowing the firm to increase job volume without adding headcount. As a result, “we were able to find hundreds to thousands of dollars per proposal… but what’s really driving this is that it’s allowing them to do more jobs because this is no longer the bottleneck.”
What They Learned (and Would Do Differently)
The initial problem diagnosis is often wrong. The client believed they had a collections issue, but the audit proved the upstream negotiation phase was the actual constraint on cash flow. This case illustrates the power of moving the decision upstream: by redesigning validation and reconciliation before invoicing ever occurred, the firm eliminated downstream friction and unlocked both speed and capacity.
Who This Is Relevant For
High-volume service businesses (restoration, construction, legal) where revenue realization depends on reconciling complex, line-item-heavy documents between two parties.
Operator Takeaways
Audit workflows before building; the true constraint often lives upstream of where teams think the problem is.
Move decisions earlier in the workflow by automating validation and reconciliation before execution or billing.
Replace human middleware in line-item reconciliation while keeping humans focused on judgment and negotiation.
How a Wind Power Operator Reduced Turbine Downtime by Fixing Pre-Dispatch Decisions, with Jozef Petro, Sudolabs
Starting Conditions (Status Quo)
A wind power plant operator faced revenue losses due to prolonged turbine downtime, where repairs often took weeks due to logistical inefficiencies. Initial discovery revealed a primary bottleneck: field technicians frequently arrived on-site without the necessary parts, turning hours-long jobs into multi-day delays and forcing excessive manual cross-referencing of error codes against dense service manuals.
The operator initially wanted predictive maintenance models, but the team found that existing field reports lacked the data quality required to support them. As Jozef Petro explained, “there is a lot of back and forth,” and even when parts were available in the warehouse, technicians still had to order or retrieve them, meaning “the downtime of the turbine that could be… a couple of hours becomes couple of days or multiple times what it could be.”
The Intervention
The team paused the predictive data science initiative to focus on immediate workflow optimization. They deployed a solution combining Large Language Models (LLMs) and vector search to bridge the gap between IoT data and human execution. The intervention specifically targeted pre-dispatch preparation and on-site information retrieval, rather than attempting to automate the repair itself.
The Redesign
The workflow shifted from reactive to probabilistic. The system now ingests sensor data and error codes to generate a list of likely required parts before the technician leaves the warehouse, preventing return trips. On-site, manual document searching was replaced by a conversational interface; technicians describe physical symptoms or error lights, and the model synthesizes relevant repair procedures from the manuals instantly. As Petro described, “we are consuming all this data and then researching the manuals… completely automatically without human intervention to suggest the parts before the repair,” using “a combination of LLM and vector search” to pre-diagnose and recommend the necessary inventory.
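A rough sketch of the retrieval step under stated assumptions: the service manuals have already been chunked into text, and the OpenAI embeddings and chat APIs stand in for whatever model and vector store the team actually used. In production the chunks would be pre-embedded and indexed rather than embedded per query as they are here.

```python
import numpy as np
from openai import OpenAI

client = OpenAI()

def embed(text: str) -> np.ndarray:
    resp = client.embeddings.create(model="text-embedding-3-small", input=text)
    return np.array(resp.data[0].embedding)

def top_manual_sections(error_codes: list[str], manual_chunks: list[str], k: int = 5) -> list[str]:
    """Retrieve the manual sections most relevant to the turbine's reported error codes."""
    query = embed("turbine error codes: " + ", ".join(error_codes))
    ranked = sorted(manual_chunks, key=lambda chunk: float(query @ embed(chunk)), reverse=True)
    return ranked[:k]

def suggest_parts(error_codes: list[str], manual_chunks: list[str]) -> str:
    """Ask the model for a likely parts list, grounded only in the retrieved sections."""
    context = "\n---\n".join(top_manual_sections(error_codes, manual_chunks))
    prompt = (
        f"Error codes reported by the turbine: {', '.join(error_codes)}\n\n"
        f"Relevant service-manual excerpts:\n{context}\n\n"
        "List the spare parts the technician should load before dispatch, "
        "with a one-line justification per part."
    )
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content
```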
The After State
Time spent researching manuals on-site dropped by 30 percent. Overall maintenance time was reduced by high single digits within the first quarter, translating to hundreds of thousands of dollars in annual savings per plant. The project recouped its investment within one year. As Petro noted, “in terms of maintenance time reduction… in the first quarter [you] get to higher single digits of time saved,” which “translates to higher hundreds of thousands of dollars a year for… an average sized power plant.”
What They Learned (and Would Do Differently)
You cannot leapfrog to predictive maintenance if your foundational reporting is flawed. The team learned that optimizing the manual repair process was a necessary precursor to gathering the high-quality data needed for future predictive models. This case demonstrates the leverage of moving the decision upstream: by redesigning pre-dispatch preparation, the operator eliminated downstream delays without automating the repair itself.
Who This Is Relevant For
Energy, manufacturing, and field service organizations managing complex assets where downtime directly impacts revenue.
Operator Takeaways
Move decision-making upstream by fixing pre-dispatch preparation before pursuing predictive maintenance.
Use vector search to translate IoT error codes into concrete inventory decisions prior to technician dispatch.
Design systems to complete preparation tasks automatically and escalate exceptions, rather than assisting technicians mid-repair.
How a Hedge Fund Raised First-Pass Compliance Accuracy from 30% to 95% by Moving Review Upstream, with Brandon Gell, Every
Starting Conditions (Status Quo)
A large hedge fund operated with a small compliance team that functioned as a mandatory bottleneck for all external communication. Every presentation, report, and piece of content leaving the firm had to be manually reviewed by this group, consuming hours each day and reducing the compliance function to document processing rather than strategic risk management. As Brandon Gell later described it, “their compliance team was handling hours and hours a day of processing information… It’s just an example of where somebody could be doing so much more strategic work, but they didn’t even realize that their job is actually to do strategic work.” Because the review process was entirely manual and downstream, initial submissions from other departments were frequently flawed, with a compliance rate of only about 30 percent upon arrival.
The Intervention
The firm moved beyond simply making the compliance team faster. Instead, they decentralized the initial review process using AI. They built a custom compliance GPT trained on the firm’s specific regulatory rules and guidelines. Crucially, they did not just deploy this to the compliance officers; they distributed the tool to the investment and marketing teams, the creators of the content, allowing them to self-service the initial compliance check before submitting work for final approval.
The Redesign
The workflow shifted from a "gatekeeper" model to a "pre-validation" model. Previously, content creators sent raw drafts to compliance and waited for redlines. In the new workflow, creators run their documents through the compliance GPT first. As Gell explained, “we created a compliance GPT so that anybody could just use this GPT, put the piece of content in and get feedback exactly like they would from the compliance team themselves to edit that content themselves so it was able to be shared.” The AI provides immediate feedback, flagging issues and suggesting edits exactly as a compliance officer would. The human compliance team now receives documents that have already been vetted, shifting their role from line-by-line editing to final verification and strategic oversight.
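As a hedged sketch of the pre-validation pattern: the rulebook, model name, and JSON shape below are placeholders rather than the firm's actual configuration, but the core idea is that creators get structured, rule-grounded feedback before anything reaches compliance.

```python
import json
from openai import OpenAI

client = OpenAI()

# Placeholder rules; the value of the real system comes from encoding the firm's specific rulebook.
COMPLIANCE_RULES = """
- No forward-looking performance guarantees.
- No references to specific client positions or unreleased figures.
- The required disclaimer must appear on any externally shared document.
"""

def precheck(draft: str) -> dict:
    """Give the content creator the same feedback compliance would, before submission."""
    resp = client.chat.completions.create(
        model="gpt-4o",
        response_format={"type": "json_object"},
        messages=[
            {"role": "system", "content": (
                "You are a pre-submission compliance reviewer. Apply these rules:\n"
                f"{COMPLIANCE_RULES}\n"
                'Respond with JSON: {"issues": [{"excerpt": str, "rule": str, "suggested_edit": str}]}'
            )},
            {"role": "user", "content": draft},
        ],
    )
    return json.loads(resp.choices[0].message.content)
```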
The After State
The impact on workflow efficiency was drastic. Incoming submissions improved from 30 percent compliance to 95 percent compliance before they ever reached a human reviewer, an outcome Gell summarized by noting that “they could go from like 30 % compliant to 95 % compliant.” This eliminated hours of back-and-forth correction cycles. The compliance team was able to shift focus from processing information to protecting the business, effectively reclaiming their time for higher-level strategic work.
What They Learned (and Would Do Differently)
Employees often confuse their tasks with their value. The compliance team initially believed reviewing documents was their job. The process revealed that their actual value was protecting the firm, and automating the review task allowed them to finally focus on that core objective. This case illustrates the leverage of moving review upstream and replacing human middleware: error-checking was shifted to the point of creation, while judgment and accountability remained with compliance.
Who This Is Relevant For
Hedge funds, private equity firms, legal departments, and highly regulated industries where small, high-cost teams are buried under high-volume document review.
Operator Takeaways
Move compliance review upstream by shifting error-checking to content creation rather than final approval.
Replace human middleware in gatekeeper workflows; automate tactical validation while preserving strategic oversight.
Deploy self-service validation tools early so teams learn to pre-correct before formal review.
How an E-Commerce Eyewear Brand Cut Prescription Verification from Days to Seconds, with Nitesh Pant, DevDash Labs
Starting Conditions (Status Quo)
An e-commerce eyewear company struggled with a manual bottleneck in selling prescription sunglasses. The existing workflow required customers to upload prescriptions, which were then routed to an external contractor for verification. This contractor manually checked expiration dates and frame compatibility before emailing a purchase link back to the customer. The process took anywhere from hours to days, creating significant friction that killed impulse purchases and stalled revenue growth.
As Pant noted, “This was a case of classic founder driven company where the founder is now stretched to their limit and they just simply don't have more hours in a day to put to do the real high-value work… And this was something that the founders would have to sometimes [do] during the weekends because there was just an overwhelming number of orders that came in.”
The Intervention
The team identified prescription verification as a high-leverage target because no off-the-shelf software existed to solve it. They decided to build a custom solution that combined AI capabilities with rigid software logic. The intervention explicitly avoided relying solely on generative AI for the entire process, instead using AI strictly for data extraction and communication while relying on deterministic algorithms for the medical verification logic.
The Redesign
The workflow moved from asynchronous human review to real-time automated validation. The system now ingests prescription uploads via AWS, using Optical Character Recognition (OCR) to extract data and validate expiration dates. It then passes that data to a hard-coded algorithm that checks if the selected frames can physically support the prescription parameters. As Pant described, “So now what we designed was a system where the customer uploads their prescription… there’s an OCR happens that reads the prescription, validates it… then it looks at what the customer requested… And then in that frame and given this prescription, can it, does it fit or not… Instead of sending an email back, [we] change the page on the Shopify website to go to your desired shopping page.” The system automates 95 percent of transactions, routing only the 5 percent with low-confidence OCR scores to a human for manual review.
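A simplified sketch of the routing logic, assuming the OCR step has already produced structured prescription data with a confidence score. The field names, threshold, and compatibility rule are illustrative; the real deterministic check encodes actual optical constraints.

```python
from dataclasses import dataclass

@dataclass
class Prescription:
    sphere: float          # diopters (simplified to a single eye here)
    cylinder: float
    expires: str           # ISO date string produced by the OCR step
    ocr_confidence: float  # how sure the extraction was (0-1)

@dataclass
class Frame:
    max_sphere: float      # strongest prescription this frame/lens combination supports
    max_cylinder: float

OCR_CONFIDENCE_THRESHOLD = 0.9   # illustrative cut-off for the ~5% of cases sent to a human

def route(rx: Prescription, frame: Frame, today: str) -> str:
    """Deterministic verification at checkout; only uncertain extractions go to a person."""
    if rx.ocr_confidence < OCR_CONFIDENCE_THRESHOLD:
        return "manual_review"
    if rx.expires < today:               # ISO dates compare correctly as strings
        return "reject_expired"
    fits = abs(rx.sphere) <= frame.max_sphere and abs(rx.cylinder) <= frame.max_cylinder
    return "proceed_to_checkout" if fits else "suggest_compatible_frames"
```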
The After State
Verification time dropped from days to seconds, allowing customers to check out immediately. This speed created an estimated $200,000 in incremental annual revenue by capturing purchases that previously churned during the wait period. As Pant summarized the impact, “You're catching some of those impulse purchases as well because there's no need to wait for the verification… over the year they'll have over $200,000 have increased sales from that increased convert due to a faster response time.” The system achieved a 97 percent success rate in routing customers correctly without human intervention.
What They Learned (and Would Do Differently)
Client-facing AI demands significantly more robust infrastructure than internal tools because latency and error handling directly impact revenue. They also learned that AI must be paired with deterministic algorithms for logic-heavy tasks; you cannot rely on an LLM to "think" through medical compliance rules accurately. This case demonstrates the power of moving the decision upstream and designing for task completion: validation happens instantly at checkout, with deterministic logic handling compliance and humans reserved only for low-confidence exceptions.
Who This Is Relevant For
Specialized e-commerce merchants, digital health companies, and regulated service providers where technical validation is a prerequisite for a transaction.
Operator Takeaways
Move validation upstream to the point of purchase; latency at checkout directly converts to lost revenue.
Favor deterministic systems for compliance logic, using AI only where unstructured data ingestion is unavoidable.
Design systems for task completion by default, with humans handling only low-confidence exceptions.
Move 2: Replace Human Middleware
How a Regional ISP Cut Technician Idle Time by 45 Minutes per Job by Automating Dispatch Validation, with Jordan Gurrieri, BlueLabel
Starting Conditions (Status Quo)
A regional fiber-based ISP faced a costly bottleneck in its field operations. A 10-person back-office dispatch team was required to manually validate equipment health checks before technicians could close out jobs. Field technicians regularly sat idle for 30 to 45 minutes waiting for dispatch confirmation, limiting daily site visits. As Jordan Gurrieri explained, “their technicians in the field would typically… have to call into the dispatch office… for equipment health checks,” and that idle time was expensive because “if this person did 15 minutes of work, but is sitting around for another 30 to 45 waiting to be able to leave and go to the next site,” the downtime quickly compounded.
The Intervention
The client initially piloted a solution using a Custom GPT, but it failed to deliver consistent results when scaled beyond a few users. The team made the specific decision to scrap the chat-wrapper approach and rebuild the solution at the API layer using the OpenAI Agents API. This moved the intervention from a text-generation experiment to a reliable software integration capable of executing backend commands.
The Redesign
The workflow shifted from a “call-and-wait” model to an automated self-service model. Two tools were deployed: a voice-enabled AI assistant for troubleshooting and an automated dispatch workflow. Instead of calling a human, the technician interacts with the AI agent, which triages the issue, runs actual command-line health checks on the hardware, validates functionality, and closes the work order. As Gurrieri described, “the second tool was an automated dispatch workflow,” where the AI assistant could “validate, run a couple of commands to check if the hardware was back up and running, do the health checks, complete and close out the work order,” and then release the technician to move on to the next site.
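A rough sketch of the close-out tool such an agent might call. The commands, hostnames, and work-order handling are placeholders, not the ISP's actual integration; the point is that validation runs as real commands against the hardware and either closes the job or escalates, rather than waiting on a dispatcher.

```python
import subprocess

# Illustrative checks; a real deployment would target the customer's actual equipment.
HEALTH_CHECKS = [
    ["ping", "-c", "3", "customer-ont.example.net"],
    ["ping", "-c", "3", "customer-router.example.net"],
]

def run_health_checks() -> tuple[bool, list[str]]:
    """Run the same command-line checks a dispatcher would, capturing output for the record."""
    logs, all_ok = [], True
    for cmd in HEALTH_CHECKS:
        result = subprocess.run(cmd, capture_output=True, text=True)
        logs.append(f"$ {' '.join(cmd)}\n{result.stdout}{result.stderr}")
        all_ok = all_ok and result.returncode == 0
    return all_ok, logs

def close_out(work_order_id: str) -> str:
    """Exposed to the agent as a tool: validate the hardware, then close or escalate."""
    ok, logs = run_health_checks()
    if ok:
        # A real system would call the work-order API here before releasing the technician.
        return f"Work order {work_order_id} closed; technician released to the next site."
    return f"Health checks failed for {work_order_id}; escalating to dispatch.\n" + "\n\n".join(logs)
```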
The After State
Technician downtime dropped significantly, reducing time-on-site by nearly 50 percent. This efficiency gain allowed technicians to complete approximately 75 percent more visits per day. The firm reduced back-office costs by upwards of $10,000 monthly per person by minimizing the need for manual dispatchers, successfully scaling the system to 150 technicians. As Gurrieri noted, “they were able to save… upwards of 10,000 monthly per person” in the back office, while enabling “maybe 75% more visits per day on site… with less technicians.”
What They Learned (and Would Do Differently)
Custom GPTs are often traps for enterprise workflows. While they work for individual productivity, they lack the consistency required for large teams; rebuilding at the API layer was necessary to ensure reliability across 150 users. Additionally, involving tenured field technicians in the design phase was critical to adoption, as they understood the physical realities of the job better than headquarters. This case shows the compounding impact of moving validation upstream and replacing human middleware: dispatch confirmation became an automated decision, not a coordination task, freeing both technicians and back-office staff simultaneously.
Who This Is Relevant For
ISPs, utilities, and field service organizations where highly paid technicians spend significant time on hold with a central office for administrative validation.
Operator Takeaways
Move validation decisions upstream so work can close without human coordination.
Replace human middleware with systems that complete tasks autonomously and escalate only exceptions.
Build for operational reliability at scale by integrating AI at the API layer, not through chat wrappers.
How a Private Credit Firm Increased Deal Throughput by 50% Without Adding Headcount, with Osman Ghandour, Soal Labs
Starting Conditions (Status Quo)
A private credit firm faced significant pressure to increase deal volume and returns but was constrained by disjointed operations. The firm lacked a unified data system, relying primarily on Excel checklists and SharePoint folders to manage complex transactions.
Critical data did not bridge the gap between pre-close origination and post-close portfolio management, forcing high-value investment professionals to manually re-enter information and delaying quarterly reporting by several days. The firm needed to scale deal capacity without doubling headcount. As Osman Ghandour described, leadership was explicit that “they need to do more deals. They need to do better deals… we don’t want to take a bunch of time and a bunch of money to go and double our team size,” which pushed the firm to look for ways to systematize operations rather than add staff.
The Intervention
The team rejected the idea of simply buying another point solution. Instead, they built a custom underwriting backbone designed to act as a single source of truth. This system was engineered to connect existing disparate platforms, specifically integrating SharePoint, iLevel, and Dynamics CRM. The intervention focused on creating a unified data layer that synced information across these systems automatically, rather than replacing the tools the teams were already using.
The Redesign
The core operational shift was standardization. Before automation could occur, Managing Directors had to agree on a unified deal process, moving away from idiosyncratic, individual workflows. Once standardized, the new system automated specific high-friction tasks: extracting data from Confidential Information Memorandums (CIMs), generating deal memos (previously manual PowerPoint exercises), and creating one-pagers. Data now flowed automatically from origination into the portfolio monitoring system, eliminating manual re-entry. As Ghandour explained, “we built an underwriting system for them… we connected that system to their internal… SharePoint instance, to iLevel, their portfolio monitoring tool, to their CRM,” creating “a single source of truth for your data” where information stayed in sync across systems.
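A minimal sketch of the “single source of truth” shape, with SharePoint, iLevel, and the CRM stubbed behind one adapter interface. Everything here is illustrative; the actual integration work lives in those adapters and in the CIM extraction that populates the record.

```python
from dataclasses import dataclass, asdict, replace
from typing import Protocol

@dataclass(frozen=True)
class DealRecord:
    deal_id: str
    company: str
    ebitda_musd: float
    stage: str   # e.g. "screening" -> "underwriting" -> "portfolio"

class DownstreamSystem(Protocol):
    """Adapter interface; real implementations would wrap SharePoint, iLevel, and the CRM."""
    def upsert(self, record: dict) -> None: ...

def promote(deal: DealRecord, new_stage: str, systems: list[DownstreamSystem]) -> DealRecord:
    """Advance the deal and push the same canonical record everywhere, so nothing is re-keyed."""
    updated = replace(deal, stage=new_stage)
    for system in systems:
        system.upsert(asdict(updated))
    return updated
```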
The After State
The time required to generate deal documentation dropped by 70 percent at each stage of the underwriting process. This efficiency gain allowed the firm to screen and execute on 50 percent more deals without adding staff. By removing administrative friction, the firm significantly improved its “speed to conviction,” allowing investment committees to reach decisions faster. As Ghandour noted, “that time on average went down by 70%,” and the firm found it could “look at 50% more deals and actually… execute on 50% more deals.”
What They Learned (and Would Do Differently)
Systematization requires standardization first. The biggest hurdle was not technical but cultural: getting senior leaders to agree on a single way to run a deal. Bringing subject matter experts into the design process early is critical to ensure they feel ownership over the standardized workflow rather than having it imposed upon them. This case demonstrates how replacing human middleware with a unified data backbone—rather than adding tools or people—can unlock deal throughput without increasing headcount.
Who This Is Relevant For
Private credit and private equity firms looking to scale Assets Under Management (AUM) and deal volume without a linear increase in headcount.
Operator Takeaways
Standardize decision workflows across leadership before attempting automation; fragmentation blocks scale.
Replace human middleware between origination and portfolio systems with a single source of truth.
Measure success by speed to conviction, not just time saved; faster decisions are the real compounding advantage.
How a PE-Backed Services Roll-Up Eliminated 40% of Redundant Work by Auditing Real Workflows, with Felix Rosner, Marble
Starting Conditions (Status Quo)
A client services team of 50 within a private equity-backed roll-up was operating on a legacy tech stack with zero visibility into actual workflows. Leadership suspected inefficiencies but relied entirely on anecdotal manager accounts to understand how work got done. The team was visibly busy, yet the firm lacked any ground-truth data to distinguish productive work from “swivel-chair” redundancy, making it impossible to identify where margin was leaking. As Felix Rosner described, following an acquisition, leadership admitted, “we don’t really know what those people are doing the entire day,” noting that employees were working on “super kind of legacy, legacy tech stacks” without the capacity to truly shadow or understand their work.
The Intervention
The firm rejected traditional consulting methods like manual shadowing. Instead, they deployed desktop agents to capture thousands of hours of keystroke and screen data across the team. This moved the diagnosis from interview-based assumptions to empirical evidence. They identified that 40 percent of the work was redundant—primarily high-volume, standardized email requests that required manual entry into legacy databases without APIs—revealing that humans were functioning as middleware between systems rather than performing client-facing work.
The Redesign
The data revealed a specific bottleneck: simple requests like balance checks or power of attorney certificates took 20 minutes to process due to slow legacy systems. The team deployed automations using tools like N8N and Retool to handle these tasks via “computer use” logic, interacting with the legacy interface directly. Furthermore, the audit surfaced that individual employees had secretly built their own internal tools to speed up tasks. The redesign involved democratizing these hidden “shadow tools” across the entire department, instantly standardizing best practices that were previously siloed and eliminating the need for manual coordination between email, legacy systems, and client responses. As Rosner explained, “we deployed the agents on the devices of the people and we captured like 10,000s of hours of work,” which made it possible to see “which tools that got people spending time on, which tasks are people spending time on,” and then automate the most repetitive work.
The After State
Automating the account reconciliation process alone saved over 100 hours per month. Standardized email request processing dropped from 20 minutes per ticket to near-instant execution. In total, the intervention projected returning 200 to 250 hours per month to the team, allowing operators to shift from data entry to actual client relationship management. As Rosner noted, the real value was ensuring time was spent on “actual strategic work, actually manning customer relationships, actually hopping on a call… to discuss matters of strategic importance.”
What They Learned (and Would Do Differently)
Efficiency gains often already exist inside the building. One of the biggest unlocks wasn’t a new AI agent, but discovering a script one high-performer had written and rolling it out to the rest of the group. They learned that relying on leadership interviews to map processes is fundamentally flawed; only ground-level data reveals the reality of work. This case shows that replacing human middleware requires first observing real workflows, not abstractions or org charts.
Who This Is Relevant For
PE roll-ups, debt collection agencies, staffing firms, and high-volume service centers relying on legacy, non-API software systems.
Operator Takeaways
Stop mapping processes via manager interviews; capture ground-truth data from frontline workflows to identify real constraints.
Surface and scale “shadow tools” built by high performers before investing in new platforms.
Target swivel-chair workflows where humans act as APIs between systems; eliminating these delivers the fastest margin impact.
Move 3: Design for Task Completion, Not Assistance
How an E-Commerce Company Cut Product Classification Time from 24 Hours to 30 Seconds, with Chris Taylor, Fractional AI
Starting Conditions (Status Quo)
A private equity-owned e-commerce company relied on an offshore BPO to manually map customer shopping lists to the company’s internal product taxonomy. This process was a significant operating expense and created a strategic vulnerability by locking process knowledge outside the firm. More critically, it imposed a 24-hour turnaround time that degraded the customer experience and slowed transaction velocity. As Chris Taylor described, the company “used to rely on [a] BPO to do this… a shopping list comes, they outsource it, it takes 24 plus hour turnaround time, user experience not great,” while cost was “a big line item for them… a significant portion of their company’s annual expenses.”
The Intervention
The team deployed a generative AI solution to replace the external vendor. Crucially, they narrowed the scope during the planning phase by explicitly choosing speed-to-production over completeness. Rather than attempting to automate the entire pipeline immediately, they explicitly excluded the complex web scraping component—often a technical stumbling block—postponing it to a second phase. This allowed the team to focus entirely on the core logic of mapping inputs to the standard taxonomy.
The Redesign
The workflow shifted from a “human-first” to an “AI-first” model. Instead of routing lists to the BPO, the system now feeds images, PDFs, or spreadsheets directly into the AI model. The system acts as a “task completer” rather than a co-pilot, automatically generating the mapped taxonomy and assigning a confidence score, completing the work by default rather than assisting a human in doing it. As Taylor explained, “the new default is instead of that… new list being routed to the offshore team, that new list goes through the new AI system and out comes the output,” with confidence scores used to flag only uncertain cases for QA. High-confidence outputs are processed instantly, while low-confidence items are flagged for internal human review. Corrections made by the QA team are saved to a database, creating a flywheel that retrains the system automatically.
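A condensed sketch of the completion-by-default pattern, assuming the model self-reports a confidence score. The threshold, prompt, and routing labels are illustrative; the team's actual scoring, QA tooling, and retraining flywheel are more involved than this.

```python
import json
from openai import OpenAI

client = OpenAI()
CONFIDENCE_THRESHOLD = 0.85   # illustrative cut-off between auto-accept and QA review

def classify_item(item_text: str, taxonomy: list[str]) -> dict:
    """Map one shopping-list line to the internal taxonomy, with a confidence estimate."""
    resp = client.chat.completions.create(
        model="gpt-4o",
        response_format={"type": "json_object"},
        messages=[{"role": "user", "content": (
            f"Categories: {taxonomy}\nItem: {item_text}\n"
            "Pick exactly one category and estimate your confidence from 0 to 1. "
            'Respond with JSON: {"category": str, "confidence": float}'
        )}],
    )
    return json.loads(resp.choices[0].message.content)

def process_list(items: list[str], taxonomy: list[str]) -> list[dict]:
    """Complete the mapping by default; only low-confidence rows are routed to internal QA."""
    results = []
    for item in items:
        result = {"item": item, **classify_item(item, taxonomy)}
        result["route"] = "auto_accept" if result["confidence"] >= CONFIDENCE_THRESHOLD else "qa_review"
        results.append(result)
    # QA corrections would then be written back to a database and folded into future runs.
    return results
```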
The After State
Turnaround time dropped from over 24 hours to approximately 30 seconds, fundamentally changing how quickly transactions could proceed without human intervention. The AI solution achieved higher accuracy rates than the human BPO. Costs were reduced by roughly 84 percent, shifting the P&L impact from heavy variable labor to minimal compute and targeted internal review. As Taylor noted, “the accuracy of the AI system was higher than the BPO accuracy,” while turnaround time fell to “basically 30 seconds instead of 24 plus hours,” with “84% [cost reduction] in year one.”
What They Learned (and Would Do Differently)
Aggressive descoping is vital for speed. By intentionally delaying the web scraping requirement, the team avoided a common technical quagmire and reached production faster. Furthermore, building a system that completes tasks rather than assisting humans eliminated the need for extensive behavior modification or training. This case shows that confidence-scored task completion scales faster than copilot-style assistance in high-volume workflows.
Who This Is Relevant For
E-commerce, logistics, and data-heavy service firms currently using BPOs for categorization, data entry, or taxonomy mapping.
Operator Takeaways
Descope technical hazards (like scraping) in Phase 1 to accelerate time-to-value on the core logic.
Design systems as task completers that run autonomously by default, involving humans only for low-confidence exceptions.
Utilize confidence scores to filter workflows, ensuring human capital is spent only on low-confidence edge cases.
How a Billboard Marketplace Cut Content Approval Latency to Near-Instant by Automating First-Pass Moderation, with Arman Hezarkhani, Tenex
Starting Conditions (Status Quo)
A billboard advertising company operated a digital platform allowing users to upload content for display. However, every uploaded image required approval through two distinct layers of human moderation. This manual review process was expensive and slow, creating a bottleneck that delayed the time between purchase and display, thereby stalling revenue realization. The client initially requested broad “AI implementation” without a specific target workflow, necessitating a diagnostic phase to identify where automation would yield the highest return. As Arman Hezarkhani explained, “once a billboard is uploaded, it has to be approved by two layers of human moderation,” a process that was “expensive, but… also time consuming,” meaning it directly constrained revenue realization.
The Intervention
The team mapped business goals to technical opportunities, identifying content moderation as a high-leverage target. They trained a custom model specifically to process billboard imagery, optimizing for speed and confidence rather than full automation. The intervention did not attempt to replace all human oversight immediately; instead, it focused on automating the first layer of review to filter content with high confidence.
The Redesign
The workflow shifted from a mandatory two-stage manual review to an AI-first filter, moving the approval decision upstream to the moment content was uploaded rather than after human review cycles. The algorithm processes uploads instantly, achieving 96 percent accuracy against human benchmarks. The second moderation layer—controlled by billboard owners—remained but became optional. Owners could now choose to trust the algorithm or tune the model’s parameters for their specific inventory, effectively delegating the initial approval decision to the AI while retaining control. As Hezarkhani noted, “we were able to save time and money on that by basically building an algorithm that is 96% accurate when compared to the human moderators.”
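A small sketch of how the tunable first-pass decision might be structured. The thresholds, defaults, and scoring interface are assumptions added for illustration; only the idea of per-owner control and an instant first-layer decision comes from the case.

```python
from dataclasses import dataclass

@dataclass
class OwnerPolicy:
    """Per-owner knobs, so each billboard owner decides how much to trust the model."""
    auto_approve_above: float = 0.95   # illustrative defaults, tunable per inventory
    auto_reject_below: float = 0.10

def first_pass(p_acceptable: float, policy: OwnerPolicy) -> str:
    """Decide at upload time; humans only see what the model is unsure about."""
    if p_acceptable >= policy.auto_approve_above:
        return "publish"          # goes live immediately, no review latency
    if p_acceptable <= policy.auto_reject_below:
        return "reject"           # clear policy violation, bounced back to the uploader
    return "owner_review"         # the optional second, human layer
```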
The After State
The intervention generated immediate revenue lift by removing the latency between upload and display, the point at which transactions had previously stalled. The project moved from concept to execution in under a month. Post-implementation analysis revealed that when the AI and human moderators disagreed, the humans were frequently the source of error, validating the shift to automation. As Hezarkhani observed, “a lot of that 4% [error rate] actually was incorrect on the human moderator side,” revealing that “the human moderators… are actually not super accurate.”
What They Learned (and Would Do Differently)
Human “ground truth” is often unreliable. The team discovered that the manual baseline they were trying to match was actually flawed, as the AI identified errors human moderators missed. Trusting the model required proving that human review was not the gold standard. This case demonstrates that accuracy constraints are often misdiagnosed, and that automating the first approval decision can outperform slower, human-heavy workflows without increasing risk.
Who This Is Relevant For
Marketplaces, media platforms, and user-generated content sites where revenue depends on the speed of approving uploads.
Operator Takeaways
Audit human accuracy first; the manual baseline is often lower than assumed, making AI viability easier to prove.
Automate the first layer of defense to reduce latency, while keeping downstream human review optional for edge cases and governance.
Target revenue-blocking workflows where speed of approval directly correlates to transaction volume.
How a PE Investment Team Cut Deal Screening Time by Turning Institutional Memory into a New Deal Engine, with Natalia Quintero, Every
Starting Conditions (Status Quo)
A private equity firm’s investment team lacked a coordinated approach to thesis formation. While the firm possessed years of rich proprietary data, investors manually sifted through internal notes and external research to draft initial deal memos. This process was isolated and inefficient, creating a bottleneck that limited the number of companies the team could critically evaluate. As one investor described it, the work “went from being a very sort of like a lonely experience of you identify a company that you're interested in… sort of [sifting] through and doing that manually,” despite the abundance of internal resources available. The firm needed to leverage its institutional knowledge to screen deals faster without sacrificing diligence depth.
The Intervention
The team consciously decided against building a complex, standalone software product, viewing speed-to-value as more important than architectural control. Instead, they chose to build the solution directly within their existing horizontal Large Language Model (ChatGPT), treating it as an execution environment rather than a chat interface. They recognized that their enterprise LLM license was severely underutilized and that configuring it with specific context would yield faster results than a custom build. The focus was on creating a structured workflow that integrated private internal data with the model's reasoning capabilities.
The Redesign
The workflow shifted from manual compilation to AI-assisted synthesis. The system was designed to complete the first pass of thesis formation by default, not assist investors in drafting from scratch. An investor now inputs a target company or sector, and the system scans the firm’s internal data repository to stress-test the thesis. As Quintero explained, the value came from “being able to do a lot of preliminary research with an AI workflow that understands who you are… and then to add to that the rich repository of information that your firm has collected over the years.” The AI drafts a V1 deal memo that highlights alignment with historical investment criteria and flags potential risks based on past firm experiences. This creates a standardized starting point for partner discussions, moving the human effort from data gathering to strategic judgment.
The After State
The firm significantly expanded its screening capacity, allowing investors to evaluate more companies without increasing prep time per opportunity. The deal memo process transformed from a solitary drafting exercise into a data-backed validation workflow. As a result, “the amount of companies that they could really think critically on and the kind of conversations they could have internally were richer and more interesting,” with institutional knowledge automatically surfaced to ground early discussions.
What They Learned (and Would Do Differently)
Horizontal LLMs are often sufficient. Teams frequently over-engineer solutions by buying niche tools or building custom apps when a well-configured environment within ChatGPT or Claude can solve the problem with less technical debt. In this case, avoiding a custom build was not a shortcut; it was the decision that made the system usable at all.
Who This Is Relevant For
Private equity firms, hedge funds, and investment teams managing significant proprietary datasets.
Operator Takeaways
Exhaust the capabilities of your enterprise LLM license as an execution environment before commissioning custom software builds.
Structure prompts to ingest internal data sets to differentiate outputs from generic public insights.
Design workflows that produce a "V1 draft" to shift human effort from creation to review and strategy.
How a Hedge Fund Cut Equity Research Prep Time by 80% by Systematizing Pre-Call and Post-Call Analysis, with Jay Singh, Casper Studios
Starting Conditions (Status Quo)
A hedge fund managing hundreds of public equities faced a bottleneck in its “corporate access” workflow. Before speaking with company executives, analysts spent 30 to 45 minutes manually synthesizing information to draft questions. This process required sprinting to download 10-Qs and earnings transcripts from investor relations sites while simultaneously cross-referencing internal financial models and historical notes stored in OneNote. The high friction of manual data gathering limited the number of equities the team could effectively monitor and cover. As Jay Singh described, “you have a ticker in mind, you have a call that’s like in 30 minutes with… the CFO of that company,” and analysts are “sprinting and downloading a bunch of… different PDFs… getting their 10-Q, getting their earnings transcript,” a process that “could have taken maybe half an hour to 45 minutes per equity.”
The Intervention
The team explicitly rejected a monolithic “black box” approach in favor of a modular build that preserved analyst control. They deployed a custom system designed to integrate external public data with the firm’s internal proprietary data, specifically connecting SharePoint, OneNote, and Excel financial models. The initial scope was strictly limited to automating pre-call Q&A generation, deliberately avoiding any attempt to automate the investment decision itself.
The Redesign
The workflow shifted from manual hunting to system-driven preparation and verification. The system now ingests relevant documents and internal notes to propose questions before a call. Post-call, a second module transcribes the discussion, and a third module compares that transcript against the firm’s financial models. The AI flags discrepancies—such as a CFO’s comment on margins contradicting the model’s projections—and suggests a buy/sell/hold stance, which the analyst validates by reviewing the system’s cited sources rather than re-assembling the analysis from scratch. As Singh explained, “the first module that we helped them build is Q&A generation,” designed to “plug into all the different data sets that they needed publicly, and also the data sets needed within their own company, their SharePoint, their OneNote,” to surface the right questions to ask.
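A minimal sketch of the discrepancy-flagging step, assuming an earlier module has already turned the transcript into structured claims. The metric names and tolerance are illustrative; the requirement carried over from the case is that every flag cites its source quote so the analyst can audit the logic.

```python
from dataclasses import dataclass

@dataclass
class Claim:
    metric: str    # e.g. "gross_margin_pct" (illustrative key shared with the financial model)
    value: float   # what the executive said on the call
    quote: str     # the exact transcript sentence, kept for auditability

def flag_discrepancies(claims: list[Claim], financial_model: dict[str, float],
                       tolerance: float = 0.05) -> list[dict]:
    """Compare call claims to the analyst's model; every flag carries its evidence."""
    flags = []
    for claim in claims:
        modeled = financial_model.get(claim.metric)
        if modeled in (None, 0):
            continue
        if abs(claim.value - modeled) / abs(modeled) > tolerance:
            flags.append({
                "metric": claim.metric,
                "call_value": claim.value,
                "model_value": modeled,
                "evidence": claim.quote,   # the "show its work" requirement
            })
    return flags
```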
The After State
Preparation time for pre-call Q&A dropped by approximately 80 percent, falling to about five minutes per equity. Post-call analysis time was reduced by 50 percent. This efficiency gain allowed the investment team to manage a larger portfolio of equities without increasing headcount, as the administrative burden of synthesis was largely automated. As Singh noted, the system was “effectively cutting down the process for the Q&A generation by 80%” and “bringing down the time saving for the analysis by maybe 50%,” enabling analysts to manage more public equities without repeating the same synthesis work.
What They Learned (and Would Do Differently)
Analyst trust required full observability. The system had to “show its work”—linking specific transcript quotes to model variances—so analysts could verify the logic. Furthermore, building “singles and doubles” (discrete modules) proved faster and more effective than attempting a “moonshot” automation that handles everything at once.
Who This Is Relevant For
Hedge funds, private equity firms, and research-heavy financial institutions managing large portfolios where decision-making relies on synthesizing internal and external data.
Operator Takeaways
Target “singles and doubles” by automating discrete prep tasks rather than trying to automate the final investment decision immediately.
Integrate internal unstructured data (notes, emails) with external filings to prevent generic AI outputs.
Force the system to cite specific data points for every recommendation so analysts can audit the logic without re-doing the work.
About Just Curious
Just Curious is a decision-support platform for leaders making high-stakes AI decisions.
We work with private equity firms and middle-market operators who are past experimentation and need clarity on what actually works, before time, budget, or momentum lock them into the wrong path.
We’ve built a curated network of applied AI experts: operators, builders, and technical strategists who have deployed AI systems in production inside real businesses. When a team is evaluating an AI initiative, we run the need through that network to surface concrete approaches, realistic scopes, trade-offs, and execution paths.
Teams submit a short description of their problem. We anonymize it, gather multiple expert perspectives (including budget ranges and timelines), and return them side-by-side. Like a lightweight, upfront Mini-RFP. No vendor pitches. No obligation.
Listen to the conversations
Many of the insights in this review come from long-form conversations with operators, builders, and AI leaders published on the Just Curious podcast. Full interviews are available on Spotify and Apple Podcasts.