K-Means made easy with BigQuery ML. Create automatic clusters in seconds.

Imagine you have a huge box of Lego bricks in front of you…

different shapes, sizes and colors, all mixed together.

How would you start sorting it to understand what you really have? You’d probably make little piles, right? One pile for the large red bricks, another for the small wheels, another for the figures… Intuitively, you’d be grouping similar pieces together.

 

Well, in the world of marketing data and web analytics, we often find ourselves faced with a similar “Lego box of data”: hundreds of landing pages, thousands of keywords, millions of user interactions… Invaluable data, but difficult to grasp if we look at it one record at a time. Wouldn’t it be great to have an automatic way to create these “piles” and discover groups of pages with similar performance, user segments with similar behaviors, or keywords that perform anomalously?

One option is to build the groups manually by defining the rules that assign each record to one group or another. For example, you can create a histogram that splits a metric into buckets of 100. But this requires knowing the data well, and data simple enough for fixed rules to work, which won’t always be the case. Other times, we simply want a few groups and have no idea what criteria should place a record in one group or another.
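To make the manual approach concrete, here is a minimal sketch in BigQuery SQL. The table `my_project.my_dataset.landing_page_metrics` and its columns (`page_path`, `sessions`, `conversions`) are hypothetical names used only for illustration, and the thresholds are arbitrary; the query assumes `sessions` and `conversions` are INT64.

```sql
-- Hypothetical table and columns, for illustration only.
SELECT
  page_path,
  sessions,
  -- Histogram-style bucket of width 100: 0-99 -> 0, 100-199 -> 100, and so on.
  DIV(sessions, 100) * 100 AS sessions_bucket,
  -- Hand-written rules: every threshold below is a choice we have to make ourselves.
  CASE
    WHEN sessions >= 1000 AND conversions >= 50 THEN 'high traffic, high conversion'
    WHEN sessions >= 1000 THEN 'high traffic, low conversion'
    ELSE 'low traffic'
  END AS manual_group
FROM `my_project.my_dataset.landing_page_metrics`;
```

Every new metric multiplies the number of rules to write and thresholds to pick by hand, which is the limitation described above.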

This is where K-Means comes in, a powerful clustering technique. And best of all, thanks to BigQuery ML, you can apply it directly to your data using SQL queries, without needing to be a data scientist or leave your BigQuery environment.
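As a first taste, a K-Means model in BigQuery ML can be sketched in two queries: one to train the model and one to assign each row to a cluster. The project, dataset, table, model and column names below are hypothetical, and the choice of four clusters is an arbitrary assumption, not a recommendation.

```sql
-- 1) Train a K-Means model on a few numeric metrics.
--    All names below are hypothetical placeholders.
CREATE OR REPLACE MODEL `my_project.my_dataset.page_clusters`
OPTIONS (
  model_type = 'KMEANS',
  num_clusters = 4,             -- assumed value; tune it for your own data
  standardize_features = TRUE   -- put metrics on a comparable scale
) AS
SELECT
  sessions,
  conversions,
  revenue
FROM `my_project.my_dataset.landing_page_metrics`;

-- 2) Assign every page to its nearest cluster.
--    page_path is not a training feature; it is passed through to the output.
SELECT
  centroid_id,
  page_path,
  sessions,
  conversions,
  revenue
FROM ML.PREDICT(
  MODEL `my_project.my_dataset.page_clusters`,
  (
    SELECT page_path, sessions, conversions, revenue
    FROM `my_project.my_dataset.landing_page_metrics`
  )
);
```

The `centroid_id` column returned by `ML.PREDICT` is the automatically created “pile” each page falls into.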

For example, we can use K-Means to extract key insights from GA4, Google Search Console, crawlers, and our clients’ databases. With this technique, we have the potential to:

 

  • Automatically segment pages or products by their performance (traffic, conversion, revenue, etc.).
  • Detect users with specific purchasing or browsing patterns.
  • Identify keywords or campaigns with unusual results (for better or worse!).
  • Group content by semantic similarity (although we will cover this in more depth in future posts about embeddings).
  • Create new dimensions or variables that enrich our analysis.

In this post, we’ll guide you step-by-step through what K-Means is, how it works in BigQuery ML, and how you can start using it today to organize your data and make smarter decisions.

Let’s get to it!

 
