AI implementation doesn't fail because organizations lack capability. It fails because the people making decisions about AI lack the discernment to deploy it well. Readiness is table stakes. Taste is the multiplier.
The raw materials: data quality, accessibility, format consistency, infrastructure readiness, and documented governance policies. This is the layer every other assessment already covers. It's necessary, but it's the floor, not the ceiling.
Organizations claim data readiness because data exists — but it's scattered across 14 systems, three cloud providers, and someone's desktop spreadsheet labeled "master_list_FINAL_v3." When an AI model needs to pull 12 months of customer interactions, it finds six months in Salesforce, four months in a legacy CRM, and two months in email threads nobody migrated.
A mid-market insurance company launched an AI claims processing pilot. Three months in, they discovered 40% of their claims data was trapped in scanned PDFs with no OCR pipeline. The AI could read the structured database records but was blind to nearly half the actual claims history. Cost: $200K+ in wasted implementation spend.
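A gap like this is cheap to catch before any implementation spend. Below is a minimal sketch of a pre-flight text-layer audit, assuming the claims archive is a directory of PDFs; the pypdf usage is real, but the directory layout and the 50-character threshold are illustrative assumptions, not from the case:

```python
from pathlib import Path
from pypdf import PdfReader  # pip install pypdf

def text_layer_coverage(claims_dir: str, sample_pages: int = 3) -> float:
    """Estimate the fraction of claim PDFs with an extractable text layer.

    Scanned documents that never went through OCR yield little or no
    text, so a low score surfaces the 'AI is blind to half the claims
    history' problem up front, not three months in.
    """
    pdfs = list(Path(claims_dir).glob("**/*.pdf"))
    readable = 0
    for pdf in pdfs:
        text = ""
        for i, page in enumerate(PdfReader(pdf).pages):
            if i >= sample_pages:  # sampling a few pages is enough
                break
            text += page.extract_text() or ""
        if len(text.strip()) > 50:  # crude threshold for "has a text layer"
            readable += 1
    return readable / len(pdfs) if pdfs else 0.0
```

A coverage score of 0.6 on a quick scan would have flagged the 40% gap before the pilot launched.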
The organization has a 40-page AI governance policy that legal drafted, the board approved, and nobody has read since. Data ownership is defined on paper but not enforced in practice. The CISO signed off on an AI security framework, but the engineering team doesn't know it exists.
A Fortune 500 retailer had a comprehensive data governance policy. An audit revealed that 60% of their data pipelines had no lineage tracking: nobody could trace which customer data fed which AI model. That's a material GDPR/CCPA violation, exactly the failure the governance document was supposed to prevent.
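The fix doesn't require an enterprise catalog on day one. A minimal lineage record per pipeline hop is enough to answer the auditor's question; the field names below are illustrative, not a standard:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class LineageRecord:
    """One hop in a data pipeline: enough to answer 'which customer
    data fed which AI model?' when the auditor asks."""
    source_dataset: str   # e.g. "crm.customer_interactions"
    destination: str      # e.g. "models.churn_v3.training_set"
    transform: str        # the job or query that moved the data
    contains_pii: bool
    recorded_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )
```

One record emitted per hop is the floor; the retailer above had zero for 60% of its pipelines.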
Organizations that treat data like a product — with owners, quality standards, SLAs, and consumers — consistently outperform. They don't just have data; they have data that is maintained for specific use cases.
Spotify's data mesh approach assigns domain teams ownership of their data as a product. When they build AI features like Discover Weekly, the recommendation team doesn't beg the streaming team for clean data — the streaming team already publishes it with documented schemas, freshness guarantees, and quality scores.
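In concrete terms, "data as a product" usually reduces to a published contract. Here's a hedged sketch of what one might contain; the fields and values are illustrative, and this is not Spotify's actual format:

```python
# A minimal data contract: the artifact behind "data as a product".
# All names and thresholds are invented for illustration.
streaming_events_contract = {
    "dataset": "streaming.playback_events",
    "owner": "streaming-team@example.com",   # a named owning team
    "schema": {
        "user_id": "string",
        "track_id": "string",
        "played_at": "timestamp",
        "ms_played": "int64",
    },
    "freshness_sla": "events queryable within 15 minutes",
    "quality": {
        "null_rate_max": 0.001,       # published, monitored thresholds
        "duplicate_rate_max": 0.0005,
    },
    "consumers": ["recommendations", "royalties"],  # who depends on it
}
```

The contract is the difference between begging for clean data and consuming it.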
Instead of building a massive data infrastructure and hoping AI use cases emerge, successful organizations start with a specific question and work backward to what data they need. This avoids the common trap of spending 18 months on a data lake nobody uses.
Capital One's approach begins with business questions, not data infrastructure. Each AI initiative starts with a clearly defined decision it needs to improve — credit risk scoring, fraud detection, customer targeting — and builds only the data pipeline required for that specific decision.
How the pieces connect. Process design, integration layers, workflow readiness. The question isn't whether you have the technology — it's whether your processes deserve to be automated or need to be redesigned first.
The most expensive mistake in enterprise AI: taking a broken manual process and layering AI on top of it. The AI runs faster, which means it produces bad outputs faster. Organizations confuse speed with improvement.
A large bank automated their loan approval workflow. The existing process had 23 manual handoffs, six of which existed because of a regulatory requirement repealed in 2019 but never removed. The AI automated all 23 steps, including the six unnecessary ones. The result: 40% faster but still fundamentally wasteful, and now harder to change because the waste was embedded in code.
Systems connected through brittle point-to-point integrations, manual CSV exports, or someone running a script on their laptop every Tuesday morning. When the AI needs real-time data from three sources, it gets stale data from two and nothing from the third because Kevin is on vacation.
A hospital network's AI patient flow prediction relied on bed availability data updated via manual nurse entry that synced every 4 hours. The AI predicted patient flow beautifully against reported data, but reported data was always 2–4 hours behind reality. Technically accurate. Operationally useless.
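The lesson generalizes: a prediction is only as trustworthy as its slowest input, and that is checkable in a few lines. A sketch follows, with the one-hour tolerance and input names assumed purely for illustration:

```python
from datetime import datetime, timedelta, timezone

MAX_AGE = timedelta(hours=1)  # tolerance for inputs the model treats as live

def stale_inputs(last_updated: dict[str, datetime]) -> list[str]:
    """Return the names of inputs too old to trust.

    A model gated on this check refuses to predict from blind spots:
    bed data that syncs every 4 hours can never pass a 1-hour gate,
    which surfaces the integration problem instead of hiding it.
    """
    now = datetime.now(timezone.utc)
    return [name for name, ts in last_updated.items() if now - ts > MAX_AGE]
```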
Organizations that achieve the highest ROI from AI treat implementation as a trigger to redesign workflows first. "If we were building this process from scratch today, what would it look like?" Then they automate the redesigned version.
When Toyota implemented AI-assisted quality inspection, they didn't just point cameras at existing stations. They redesigned the entire quality workflow: moving inspection earlier, consolidating redundant checks, eliminating steps that existed only because human inspectors couldn't see certain defects. Defects dropped 50% even as inspection steps were removed.
Organizations with clean API layers between systems can plug AI in without re-architecting everything. The AI becomes a new consumer of existing, well-documented interfaces.
Stripe's architecture means any new capability — AI fraud detection, smart routing, risk scoring — plugs into the same API infrastructure external customers use. No special integration work needed. This is why they ship AI features in weeks, not quarters.
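The underlying pattern is that an AI feature is just another API client. Here's a sketch in Python, where the endpoint URL, fields, and model are hypothetical stand-ins, not Stripe's actual interface:

```python
import requests

API = "https://api.internal.example.com/v1"  # hypothetical endpoint

def score_charge_risk(charge_id: str, token: str, model) -> float:
    """A fraud scorer consuming the same documented API that external
    customers use -- no bespoke integration layer to build or maintain."""
    charge = requests.get(
        f"{API}/charges/{charge_id}",
        headers={"Authorization": f"Bearer {token}"},
        timeout=5,
    ).json()
    features = [[charge["amount"], charge["attempts"], charge["account_age_days"]]]
    return model.predict_proba(features)[0][1]  # any scikit-learn-style model
```

The AI capability is one more consumer of a well-documented interface, which is why it ships in weeks.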
Not governance in the abstract compliance sense. Actual human accountability. Who owns AI outcomes? Who decides to kill a failing project? Who gets the call when the agent hallucinates at 2am? Most organizations have governance documents but no governance behavior.
AI gets assigned to a Center of Excellence or an AI Council that meets monthly, reviews dashboards, and has no actual decision-making authority. When something goes wrong, there are four people who could theoretically be responsible but none who will actually pick up the phone.
A major retailer's AI pricing engine began recommending prices below cost during a holiday weekend. Engineering assumed business was monitoring. Business assumed engineering had guardrails. The AI team assumed pricing had override authority. Nobody acted for 72 hours. Estimated impact: $2M+ in margin erosion.
Organizations invest months and millions into AI projects and create institutional momentum that makes it politically impossible to stop a failing initiative. Nobody wants to be the person who killed the CEO's pet AI project.
IBM's Watson for Oncology recommended cancer treatments worldwide. Internal documents revealed it sometimes recommended unsafe treatments, but organizational momentum behind "AI-powered cancer care" made it extraordinarily difficult for clinicians to push back. Accountability deferred to the technology's reputation rather than clinical outcomes.
A single named individual — not a team, not a committee — is personally accountable for each AI deployment's outcomes. This person has the authority to pause or kill the deployment without committee approval.
Airbnb assigns a DRI (Directly Responsible Individual) to every AI feature. When pricing suggestions showed signs of discrimination, the DRI had pre-authorized authority to disable the feature within hours, investigate, and re-enable only after the issue was understood and fixed.
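Mechanically, pre-authorized authority can be as simple as a feature flag whose disable path checks for one name. A sketch, with the flag store and names invented for illustration:

```python
# Pre-authorized kill switch: one named owner, no committee in the loop.
FEATURE_FLAGS = {"smart_pricing": {"enabled": True, "dri": "a.chen"}}

def kill_feature(feature: str, actor: str, reason: str) -> None:
    """Disable a live AI feature immediately, if and only if the caller
    is its Directly Responsible Individual."""
    flag = FEATURE_FLAGS[feature]
    if actor != flag["dri"]:
        raise PermissionError(f"{actor} is not the DRI for {feature}")
    flag["enabled"] = False
    flag["reason"] = reason  # audit trail; re-enabling follows the documented review
```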
Before launching an AI initiative, successful organizations define the specific conditions under which they will stop, pause, or pivot — documented before institutional momentum builds.
Netflix includes pre-registered failure criteria for every model deployment. If a recommendation model's metrics drop below threshold for 48 hours, it automatically rolls back. The decision to stop is made before the emotional investment begins.
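The mechanism matters less than the timing: the criteria are written down before launch. Below is a sketch of such a guard, with the metric name, threshold, and window invented for illustration; this is not Netflix's actual system:

```python
from datetime import datetime, timedelta

# Pre-registered before launch, while judgment is still unemotional.
FAILURE_CRITERIA = {
    "metric": "rec_click_through_rate",
    "floor": 0.042,                       # agreed absolute minimum
    "sustained_for": timedelta(hours=48),
}

def should_roll_back(current_value: float,
                     breach_started_at: datetime | None,
                     now: datetime) -> bool:
    """True once the metric has sat below the floor for the full window.

    breach_started_at is when the metric first dipped below the floor
    (None if it hasn't); the caller tracks it per deployment.
    """
    if current_value >= FAILURE_CRITERIA["floor"] or breach_started_at is None:
        return False
    return now - breach_started_at >= FAILURE_CRITERIA["sustained_for"]
```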
Talent, change readiness, and organizational honesty. Can your people actually use these tools? Are they willing to change how they work? And most diagnostically — can your organization be honest about where it actually is versus where it wishes it were?
The organization runs AI literacy training, checks the "upskilled" box, and changes nothing about how work is actually done. Employees attend a 2-hour workshop on prompt engineering, go back to their desks, and continue doing their jobs exactly the same way. Training created awareness but not behavior change.
A Big Four consulting firm gave all consultants access to an internal AI tool for research synthesis. After 6 months, 80% of queries were basic factual lookups — the equivalent of using a Ferrari to drive to the mailbox. The tool could synthesize 50-page reports, but nobody was trained on those capabilities because the rollout focused on "how to log in" rather than "how this changes what your Tuesday looks like."
Leadership asks "are we ready for AI?" and gets optimistic answers because nobody wants to slow down the CEO's vision. Readiness assessments are gamed — teams rate themselves 4/5 on data quality because admitting it's a 2 would mean explaining why they haven't fixed it yet.
McKinsey's research found that while 88% of organizations report using AI, only 39% can point to measurable EBIT impact. That delta represents organizations that told leadership "we're doing AI" but couldn't prove it was working — and nobody asked hard enough.
Organizations that actually transform don't just give people AI tools — they redefine what good performance looks like. The performance review, the workflow, the daily standup, the definition of "done" — all of it shifts.
Klarna eliminated its reliance on Salesforce and Workday, replacing significant portions of both with AI. The key: they simultaneously restructured teams, eliminated middle-management layers, and redefined success metrics. Customer service agents weren't just given an AI assistant; their role was redefined from "resolve this ticket" to "handle the 15% of cases AI can't."
Organizations where leaders are rewarded for identifying gaps — not punished for admitting them — consistently deploy AI more successfully.
GitLab's radically transparent culture means AI readiness gaps get surfaced immediately. When their AI code review tool showed lower accuracy on proprietary Ruby code, the team publicly documented the limitation, proposed a timeline to fix it, and set explicit criteria for when it would be ready. No political cover-up, no inflated metrics.
Strategic judgment — the ability to distinguish between a good AI decision and a merely popular one. The discernment to know when NOT to deploy AI. This is the dimension no other assessment measures, because it can't be self-reported — it has to be revealed through choices.
Every organization has leaders who can evaluate a spreadsheet. Far fewer have leaders who can evaluate a decision. Taste is the difference between "this AI initiative has good metrics" and "this AI initiative is solving the right problem in the right way at the right time."
Taste is not subjective — it's observable through the quality of decisions an organization makes when certainty is low and stakes are high. It's what separates the 4% of organizations that achieve scaled AI deployment from the 96% that don't.
The organization deploys AI because competitors are, because the board is asking, because the CEO saw a demo at Davos. Use case selection is driven by what sounds most impressive in a press release, not by where AI creates the most value.
Numerous enterprise chatbot deployments in 2023–2024 launched because "everyone needs a chatbot." Many handled 5% of queries, frustrated customers on the other 95%, and cost more than the human agents they replaced. Companies with taste invested in back-office document processing — unglamorous, high-ROI, and invisible to the press.
Optimizing for the number that's easy to measure rather than the outcome that actually matters. An AI service bot that resolves 85% of tickets but tanks NPS. A recruiting AI that screens 10x more candidates but introduces subtle bias.
Amazon's internal AI recruiting tool was trained on 10 years of hiring data and got very good at predicting which resumes matched historical hires, which meant it systematically penalized resumes containing the word "women's" (as in "women's chess club captain"). The metric (match rate) was excellent. The judgment (training on biased data) was terrible. Amazon killed the project. That kill decision itself was an act of taste.
Deploying AI to 15 use cases simultaneously, spreading resources thin, getting mediocre results everywhere, and achieving excellence nowhere. Taste is also about restraint.
Organizations with taste deploy AI to fewer use cases but deploy it excellently. They choose the use case where AI creates disproportionate leverage.
John Deere didn't try to make AI do everything on a farm. They focused on one problem: identifying and spraying only the weeds, not the entire field. See & Spray reduced herbicide use by 77%. The taste wasn't in the technology — it was in the restraint.
The highest expression of AI taste is recognizing when AI is the wrong solution. When a simpler rule-based system, a process redesign, or even a well-structured spreadsheet solves the problem better.
Basecamp (37signals) has been vocal about NOT deploying AI where simpler solutions work. If a rule-based system solves 95% of cases, don't build a machine learning model to get to 97%. The additional 2% rarely justifies the complexity. This is taste — knowing where sophistication creates value and where it creates cost.
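That trade-off is measurable before any model exists: write the rules, then count what they cover. A sketch with invented rules and fields:

```python
# Before reaching for ML, measure the rule-based ceiling.
def route_ticket(ticket: dict) -> str | None:
    if "refund" in ticket["subject"].lower():
        return "billing"
    if ticket["plan"] == "enterprise":
        return "priority"
    if "password" in ticket["subject"].lower():
        return "self_serve_reset"
    return None  # unhandled -> human triage

def rule_coverage(tickets: list[dict]) -> float:
    routed = sum(route_ticket(t) is not None for t in tickets)
    return routed / len(tickets)
```

If coverage comes back at 0.95, the question isn't "can ML reach 0.97?" but "is that 2% worth owning a model forever?"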
Not just asking "will this AI work?" but "what happens after it works?" If the AI bot handles 80% of queries, what happens to the remaining 20%? Are those the hardest, highest-stakes interactions — meaning your human agents now handle only angry, complex cases all day?
When Shopify deployed AI for merchant support, they explicitly designed for the second-order effect. They knew AI would handle routine queries, leaving humans with harder cases. So they simultaneously restructured the human support role — different title, higher pay, different training. They anticipated that "AI handles the easy stuff" would change the human job, and they designed for that change proactively.
Taste can't be self-reported. Nobody says "my AI judgment is poor." So instead, it's tested through scenario-based choices where there's no obviously correct answer. The pattern across multiple scenarios reveals whether you default to speed, safety, sophistication, or inertia.
Your AI pilot shows 78% accuracy on a task that humans do at 85%. What's your move?
What it reveals: A is premature. B is safe but incurious. C shows analytical depth. D shows strategic sophistication. The best answer depends on context — which is why the follow-up "why?" matters more than the choice itself.
Your CEO returns from a conference excited about deploying an LLM for internal knowledge management. Your knowledge base is 60% outdated Confluence pages, SharePoint files, and tribal knowledge. First move?
What it reveals: A is hype-driven. B is disciplined but assumes the solution. C tests assumptions empirically. D reframes entirely — the CEO might be solving a problem with a simpler answer.
18 months and $400K in. Mixed results. Passionate team. Invested executive sponsor. Growing usage but unclear ROI. Your call?
What it reveals: A is sunk cost avoidance disguised as patience. B shows creative problem-solving. C shows courage. D shows governance maturity. The pattern across scenarios like this reveals whether the organization defaults to momentum or judgment.
Your team proposes an AI agent that autonomously processes customer refunds up to $500. 96% accurate in testing. What's your primary concern?
What it reveals: A is risk management (important but table stakes). B is brand awareness. C is edge-case thinking — the hallmark of implementation maturity. D is first-principles thinking. Taste is revealed in what someone worries about first.
| Feature | Traditional Assessment | Jewell Assessment |
|---|---|---|
| Format | Static questionnaire | Adaptive conversation |
| Measures | Capability (can you?) | Capability + Judgment (should you?) |
| "I don't know" | Skip / N/A | Diagnostic signal |
| Output | Score + checklist | Constraint diagnosis + taste signature |
| Accountability | Measures governance docs | Measures governance behavior |
| Taste | Doesn't exist | Revealed through scenario choices |
| Time to value | 6–10 week engagement | 10 minutes to first insight |
| Cost | $50K–$175K consulting | Free |
| Bias | Built by vendors selling implementation | Built by an independent practitioner |
5–12 minutes. No login. Immediate results with a constraint diagnosis, taste signature, and three prioritized actions.
Start the Assessment