Shopify and AI: Shopify Magic + Sidekick — An E-Commerce Owner's 90-Day Field Test

I asked Sidekick to summarize last week's abandoned-cart cohort, and it told me the recovery rate was 23.4%. The dashboard said 17.2%. I asked three more times and got three different numbers. Eventually I rebuilt the query myself in ShopifyQL (Shopify's native SQL-style query language) and got 17.4%, within a fraction of a point of the dashboard.

That was minute three of a 90-day test of Shopify Magic and Sidekick on a real store — about 50 SKUs (stock-keeping units), ~3,000 orders a month, a tiny team. I went in skeptical. I came out with a checklist of what to keep, what to ignore, and one ugly truth about AI in e-commerce: it does about 80% of the writing well, and the 20% it gets wrong is the part that makes a store sound like itself.

What I tried in the first month

I turned on Magic's product description generator for the new arrivals. The interface is a sparkle icon next to the description box. You give it a few keywords and a tone ("expert" / "supportive" / "playful" / "inspired" / "bold"), and it spits out a draft in about four seconds. I tested it on 12 products in a row, including ceramics, a hoodie, a coffee subscription, and three SKUs of dog treats. Eight of the twelve were usable on the first pass, with light edits. Two were generic to the point of being useless ("This product is great for many occasions"). Two had me scrolling back to confirm I'd actually written "hand-thrown stoneware" in the keyword list, because the output didn't mention the kiln process at all.

The blog post tool was better than I expected for top-of-funnel (TOFU, the awareness stage) SEO posts, and worse for anything with a specific point of view. The FAQ generator saved the most time — I'd been putting off rewriting 47 product-page FAQs for nine months, and Sidekick did them in an evening. About 60% needed human polish, but the structure was right.

Email subject lines were the headline win. I'm not going to pretend I ran a clean A/B test (A/B testing means sending two versions to similar groups and comparing results to see which performs better). I sent the AI-generated subject lines to a ~10,000-subscriber segment, then switched back to my own lines for the next send. Across four matched sends, the AI-drafted lines opened at 38.2%, 41.7%, 36.9%, and 40.4%. My own lines on the same kind of content opened at 32.1%, 34.0%, 29.5%, and 33.8%. That's a lift of roughly 6–8 percentage points on opens, and CTR (click-through rate, the share of recipients who clicked a link) moved up roughly proportionally. I am not attributing this entirely to AI: the lines were also shorter, which is a confound I couldn't control for. But the pattern held across all four sends.
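
For anyone who wants to check the arithmetic, here is a minimal Python sketch that uses only the eight open rates quoted above. Pairing the sends by position is an assumption made for illustration, and the variable names are illustrative; nothing here comes from Shopify.

```python
from statistics import mean

# Open rates (%) from the four matched sends quoted above.
ai_lines = [38.2, 41.7, 36.9, 40.4]
own_lines = [32.1, 34.0, 29.5, 33.8]

# Assumption: sends are paired by position. The post does not say
# which AI-drafted send was matched with which hand-written send.
lifts = [round(a - o, 1) for a, o in zip(ai_lines, own_lines)]

# Average gap in percentage points, independent of the pairing.
avg_lift = round(mean(ai_lines) - mean(own_lines), 2)

print("per-send lift (points):", lifts)
print("average lift (points):", avg_lift)
```

The average gap does not depend on how the sends are paired, so it is the more robust of the two figures. Neither number says anything about the shorter-lines confound.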

What broke

Sidekick hallucinated. Not the way ChatGPT hallucinates in a chat, where you can spot it. Sidekick hallucinated in charts, which is more dangerous. It invented a "predicted revenue" chart on my dashboard that didn't match anything in the data layer. It confidently told me my best-selling product last quarter was a SKU that hadn't existed until 30 days prior. When I asked it to draft a customer segment ("everyone who bought a gift in November and hasn't returned since"), it built the segment, but when I checked the count against the actual orders, it was off by 11%.

The product description tool also has a quiet failure mode. For technical products — anything with real specs, like a French press with a stated mesh size or a serum with a real ingredient list — it invents plausible-sounding details that are not in your product data. "Made with sustainably sourced cork" appeared in a draft for a product that, in fact, used a synthetic base. I caught it because I check. Most store owners won't catch it on every SKU. That's the risk: Magic doesn't know what's true about your product. It knows what sounds true.
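
One cheap guard against this failure mode is to flag any material or ingredient term that shows up in a draft but not in the product's own data. The sketch below is a rough illustration in Python, not a feature of Magic: the function name, the watchlist, and the sample product facts are all hypothetical, and plain substring matching is an assumption that will miss paraphrases.

```python
def unsupported_terms(draft: str, product_facts: str, watchlist: list[str]) -> list[str]:
    """Return watchlist terms that appear in the draft but not in the product facts.

    Uses case-insensitive substring matching, which is crude: it cannot
    catch a claim that is reworded, only one that reuses a listed term.
    """
    draft_lower = draft.lower()
    facts_lower = product_facts.lower()
    return [
        term
        for term in watchlist
        if term.lower() in draft_lower and term.lower() not in facts_lower
    ]


# Hypothetical data for illustration only.
watchlist = ["cork", "stoneware", "stainless steel", "organic"]
facts = "Base: synthetic polymer. Body: hand-thrown stoneware."
draft = "Made with sustainably sourced cork and hand-thrown stoneware."

flagged = unsupported_terms(draft, facts, watchlist)
print(flagged)  # ['cork']
```

A flagged term is a prompt to go read the draft, not proof that it is wrong. An empty result proves nothing either, so the manual check on technical SKUs still applies.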

The bigger problem is brand voice. The "expert" tone is its strongest default. For my store, where the voice is closer to "a friend who happens to know ceramics," the expert output was unusable on the first sentence. I had to write the opening 40–60 words of every product page myself, then hand the rest to Magic with a one-line instruction: "Match the tone of the first paragraph." It worked, but it meant AI wasn't replacing my writing — it was drafting the body while I wrote the soul.

What I changed

Three changes moved the needle.

First, I stopped asking Magic to write whole product pages. I write the opening paragraph (the brand-voice part) and ask Magic to do the features-and-benefits middle section. That's where it shines: mechanical, scannable, structured. The opening and the closing CTA (call to action — the prompt telling the reader what to do next, e.g. "Add to cart") stay human.

Second, I use Sidekick for analytics queries that I would otherwise build in ShopifyQL. Specifically, "show me last month's refund rate by product line" or "what's the repeat-purchase rate for customers acquired in March?" — short, well-defined questions. When the question is open-ended ("why did revenue drop on Tuesday"), Sidekick gives me a confident, plausible, often wrong story. I learned to verify every numeric claim against the dashboard.
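
The verification habit can be written down as a simple tolerance check. This is a minimal Python sketch, not anything Sidekick or ShopifyQL provides: the function names and the thresholds (0.5 points for rates, 2% for counts) are assumptions picked for illustration, and the segment counts in the second example are hypothetical.

```python
def within_points(reported: float, reference: float, tol_points: float = 0.5) -> bool:
    """True if a reported percentage is within tol_points of the reference."""
    return abs(reported - reference) <= tol_points


def relative_gap(reported: float, reference: float) -> float:
    """Absolute difference as a fraction of the reference value."""
    return abs(reported - reference) / reference


# Recovery-rate figures from the abandoned-cart example.
dashboard_rate = 17.2
sidekick_ok = within_points(23.4, dashboard_rate)  # Sidekick's first answer
manual_ok = within_points(17.4, dashboard_rate)    # hand-built query

# Hypothetical segment counts; the post only states the gap was 11%.
gap = relative_gap(1110, 1000)
segment_ok = gap <= 0.02

print(sidekick_ok, manual_ok, round(gap, 2), segment_ok)
```

The point of fixing a threshold in advance is that it removes the temptation to accept a number because it sounds plausible.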

Third, I treat the AI image tools as a starting point, not a deliverable. The background remover is fine. The "AI lifestyle scene" generator makes product photos that look like every other Shopify store in 2026, because they all used the same generator. I went back to studio shots for hero images.

90-day results

  • Product description time per new SKU: ~45 minutes → ~18 minutes, mostly because the middle section is now auto-drafted.
  • Email open rate on campaigns using AI-drafted subject lines: +5.4 points on average across 11 sends.
  • Time spent rewriting FAQs: dropped from "nine months of dread" to one evening.
  • The 30 hours a month I expected to get back: I got about 12. The missing 18 went into reviewing, fact-checking, and rewriting the openings Magic couldn't handle.
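
The time figures in the list above reduce to a few lines of arithmetic. A short Python sketch, using only the numbers stated there (variable names are illustrative):

```python
# Minutes per new product description, before and after.
before_min, after_min = 45, 18
saved_min = before_min - after_min        # minutes saved per SKU
saved_share = saved_min / before_min      # fraction of drafting time saved

# Hours per month: expected savings versus what actually came back.
expected_hours, actual_hours = 30, 12
realized_share = actual_hours / expected_hours
review_hours = expected_hours - actual_hours  # spent on review and rewrites

print(saved_min, round(saved_share, 2))
print(round(realized_share, 2), review_hours)
```

So drafting time per SKU fell by 60%, while only 40% of the expected monthly hours materialized.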

The one thing I'd tell a peer

If you run a small e-commerce store and you're thinking about turning on Magic and Sidekick: turn them on, but protect the parts of your store that make it yours. Use Magic for the body of product pages, not the soul. Use Sidekick for short, specific analytics questions, and treat any chart or number it produces as a draft, not a fact. Don't trust its product descriptions on anything technical. And for the love of your open rate, let it write your subject lines — that part actually works, and it's the highest-ROI (return on investment — the ratio of output gained to effort spent) use of the entire stack.

The AI did not give me 30 hours back. It gave me 12, and it gave me a slightly better open rate, and it gave me a faster way to do the work I already knew how to do. That's the honest report. The rest of what the demo videos show, you still have to do yourself.