
Report: The "Visibility Trap" - Beware of AI Rankings When Selecting AI Tools (and Maybe Software and B2B Services in General)

Executive Summary: This report examines a critical flaw in using Large Language Models (LLMs) as the primary engine for software procurement. Using Traq.ai as a primary case study, our tests demonstrate that AI rankings prioritize market visibility, distribution, and SEO over functional superiority or the usefulness of each product's features. Relying on an initial AI query often results in a list of "popular" tools rather than the "best" tools for a specific workflow.



The story behind it: At Navigamo, we began using Traq.ai (an AI tool designed specifically for sales teams) when they became our clients, as we believe the best way to truly understand a product is to use it, provided you are part of its natural target audience. At first, only the people involved in sales were using it, but it "trickled down" to all of us, even those not directly involved in sales. Since then, we have tried other products that are not necessarily sales-oriented but are good, simple note-takers (Fathom being one of them), and nothing we saw or tried even compared. We ended up keeping Traq.ai for our team (and we pay for it), so I know firsthand how phenomenal it is.

Then one day I asked ChatGPT about the top 10 AI tools for sales and, to my surprise, Fathom (not really a sales tool) was there, and Traq was not. My first thought was "of course, it's ChatGPT" (based on my experience, I consider ChatGPT the weakest one, quality-wise, compared to Gemini, Claude, and Copilot, unless what you want most from an LLM is to be told that you always ask "oh, such smart questions!"...). Nevertheless, it piqued my interest to understand why Traq was absent. After challenging ChatGPT about some of its answers and asking it to focus on the original request (the 10 best sales tools, not meeting assistants), Traq ended up ranking third after Gong and Chorus, with others like Fireflies and Otter.ai in 5th and 6th place.

While I never expected an LLM to give me good advice on AI tools, this experience was so interesting that I decided to repeat the experiment with some of the other big LLMs (I went with Gemini, Claude/Sonnet 4.5, and Copilot).  In general, the cycle was the same:

  1. We started with a relatively general question (while already establishing that we were asking about the best tools specifically built for sales teams).

  2. We normally got a list that did not include Traq.ai.

  3. We then asked what the LLM knew about Traq. Its answers almost always ended with a question along the lines of: “Would you like me to create that vendor evaluation framework now and review the previous rank, or do you have specific vendors you're comparing Traq.ai against?”

  4. When we then asked it to review the ranking, the worst Traq did was 6th place (with Claude), still above Fathom and Salesforce.

  5. When we asked the LLM why Traq was not on the first list, the “excuses” centered on popularity, and, in general, it acknowledged that Traq should have been there.

The whole experience was quite eye-opening and revealed some of the pitfalls of AI if you take its initial answers as "good enough." They are not, and below you can find some of the reasons why, as well as a simple methodology for getting much better results.


So why do the first results tend to be skewed and not great? In this case, why was Traq.ai not included in the initial lists, while other tools that were not even part of the category we asked about were included?


1. The Training Data Gap: If You Aren’t "Big," You Don’t Exist

One of the primary reasons a specialized tool like Traq may be omitted from initial AI lists is a lack of representation in the LLM’s static training data.

  • Information Lag: LLM training data is not exhaustive and has a cutoff date. For example, Claude (Sonnet 4.5) explicitly admitted that Traq.ai was either not in its core training data or was mentioned too briefly to allow a confident answer, and that its training cutoff was almost a year ago.

  • The Size Bias: AI models favor "category-defining" platforms such as Gong or Salesforce because they receive extensive media coverage, reviews, and documentation. Smaller or newer companies (even those founded as early as 2021) may not have reached the "critical mass" of public content required to be considered by an AI in a generic search.

  • The Catch-22: Smaller, innovative companies need visibility to grow, but they already need to be big to be "named" by an AI, creating a vicious circle that may take months or years to break if the models don't change how they weigh the information out there.


2. Category Dilution: Mixing "Note-Takers" with "Sales Platforms"

AI models often fail to distinguish between a general-purpose utility and a niche-specific solution, leading to misleading rankings.

  • Generalist Bias: Tools like Fathom frequently appear on "Top 10" lists because they are universally adopted by recruiters, founders, and non-sales teams.

  • The "Overkill" Logic: AI may exclude specialized tools because it deems them "overkill" for a general user, even if the user’s query was specifically about sales. For example, Fathom is characterized as an "AI note-taker that happens to work for sales," whereas Traq is a "Sales coaching platform that happens to record meetings".

  • Functional Superiority vs. Adoption: When pressed, AI acknowledges that for sales-specific needs like buyer sentiment, objection identification, and deal scoring, Traq outperforms general tools that rank higher on initial lists.



3. SEO and Distribution Bias


AI-generated lists are often reflections of the most "SEO-optimized" content on the web rather than objective feature audits.

  • SEO Echo Chambers: AI often pulls from articles with titles like "11 Best AI B2B Sales Tools," which are frequently written by other AIs or designed purely for search engine rankings.

  • Marketing Noise: Platforms with massive funding and "word-of-mouth" presence (like Gong or Fireflies) dominate the "marketing noise" that feeds LLMs. A tool like Traq, which may offer more actionable sales insights at a fraction of the cost, is often buried because it has a lower review volume and less marketing saturation.


4. The "Pressing" Effect: Why the First Answer is Often Wrong

Our tests highlight a consistent pattern: the more you question the AI, the more accurate the ranking (or answer) becomes.

ChatGPT, when pressed on the quality of its first answer.


  • Contextual Re-ranking: Once the AI is forced to compare tools side-by-side or is given a specific sales-coaching rubric, Traq moves from being omitted to being ranked as a "Top 3" or "Top 6" contender.


  • Hidden Gems: When prompted to look for "under the radar" tools, AIs like Copilot and Claude suddenly recognize Traq as a "rising star" for SMBs and mid-market teams, offering better value for rep development than the enterprise giants.



Conclusion - A Better Strategy for Selection? Ask, Refine, Challenge, Refine Again.

Based on our research, using AI as a starting point for tool selection creates a bias toward expensive, enterprise-grade heavyweights (in our example, Gong) or basic, lightweight tools that may not be what we asked for (in this case a note-taker, Fathom, instead of a fully sales-oriented tool), while missing the mid-market, specialized tools (Traq) that can often provide the best ROI.

Recommendations:

  • Avoid Generic Queries: When possible, do not ask for generalizations like "The Best 10 Sales AI Tools." Instead, request tools that specialize in specific features such as "buyer sentiment scores" or "proprietary call scoring".

  • Use your first search to discover the features… then ask again: If you are not sure what is out there on the market or which specific features you may want, start with the generic query, but then ask for the features of the tools on that list. Tempting as it may be, avoid selecting brands at this step; instead, ask which features these brands share and which ones are unique. Then identify from that list the features you would want in your ideal tool and run a new query, asking for tools with features XX and YY.

  • Verify Category Fit: Ensure the AI does not mix categories (e.g., "meeting assistants" with "revenue intelligence" platforms), as this dilutes the quality of the features being evaluated.

  • Challenge the Omission: If a known tool, or one you are considering, is missing from the list of “best,” do not assume the others are better. Instead, ask the AI to compare it specifically with the "top" tools. This forces the AI to move beyond its popularity-biased training data and perform a real-time search. (A rough sketch of this whole ask-refine-challenge loop follows below.)
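If you prefer to script this loop rather than run it by hand in a chat window, here is one way the ask-refine-challenge sequence could look. It is a minimal sketch, assuming the OpenAI Python SDK (openai>=1.0) and an OPENAI_API_KEY environment variable; the model name, the prompts, and the Traq.ai comparison step are illustrative placeholders to adapt to your own category, features, and candidate tools, not part of the original experiment.

```python
# Illustrative sketch of the "ask, refine, challenge" loop.
# Assumes the OpenAI Python SDK and an OPENAI_API_KEY environment variable.
# Model name, prompts, and the tool being "challenged" are placeholders.
from openai import OpenAI

client = OpenAI()
messages = []  # keep the whole conversation so each step builds on the last


def ask(prompt: str) -> str:
    """Append a user prompt, get the assistant's reply, and keep the history."""
    messages.append({"role": "user", "content": prompt})
    reply = client.chat.completions.create(model="gpt-4o", messages=messages)
    answer = reply.choices[0].message.content
    messages.append({"role": "assistant", "content": answer})
    return answer


# 1. Generic query: expect a popularity-biased list.
print(ask("List the 10 best AI tools built specifically for sales teams."))

# 2. Refine on features, not brands.
print(ask("For the tools above, which features do they share and which are "
          "unique? Do not recommend a winner yet."))

# 3. New query driven by the features that matter to us (placeholders).
print(ask("Now list tools that offer buyer sentiment scoring and proprietary "
          "call scoring, regardless of how well known they are."))

# 4. Challenge the omission of a specific candidate (placeholder name).
print(ask("Traq.ai was not in your first list. Compare it feature by feature "
          "with your top 3 picks and revise the ranking if warranted."))
```

Keeping the full message history is what matters here: each follow-up forces the model to re-rank against its own previous answer instead of starting from the generic, popularity-driven list again.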


In Summary: Focus Is Key

Don't let AI narrow your vision to "popular" tools, services, or software over effective ones. Make sure you poke and prod the AI you are using; spend 15' or 20' more pushing it toward focusing on things like features or ease of implementation, and you may get much better results.

As in high school, the most popular students may be great kids that everyone knows and wants to be friends with, but they may not be the ones you need on the chess or dance team if your goal is to win competitions.


If you are interested in having an analysis like this for your brand, contact us at hello@navigamo.co with the subject line: AI visibility


