2
The Foundation of Order and Atomic Values 1:02 Miles: You know, Lena, before we get into the heavy lifting of Power Query, we really need to anchor ourselves in what 1NF actually demands. It’s funny—people hear "Normalization" and they think of some high-level academic theory from the seventies. And yeah, Edgar F. Codd did give us these rules in the early 1970s, but they are incredibly practical. If your data isn't in First Normal Form, it isn't really a "table" in the formal sense—it’s just a grid of characters.
1:33 Lena: That’s a bold statement! "It isn't really a table." I think a lot of our listeners might look at their current trackers and realize they’ve just got a digital legal pad. So, if we’re building this foundation, where do we start? I keep hearing the term "atomic values." It sounds like science class.
1:49 Miles: It does, doesn't it? But in the world of data, an atomic value just means something that can’t be split any further without losing its meaning for your current task. Think about a cell that says "Red, Blue, Green." That’s a list, not an atomic value. If you want to count how many "Red" items you have, Excel’s standard count functions are going to struggle because they’re looking for a cell that *is* "Red," not a cell that *contains* it buried among other words.
2:17 Lena: Right, and I’ve seen this happen with names all the time. Someone puts "John Smith" in one cell. Now, if you just want to print a mailing label, maybe that’s fine. But what if you want to sort by last name? Or what if you want to send an email that starts with "Dear John" instead of "Dear John Smith"? Suddenly, that single cell is a problem.
2:34 Miles: Exactly! One of the sources I was looking at makes a great point about this—even something like a full name or a multi-part identification number can violate 1NF because it's not indivisible. If you have a serial number like "COMP-2026-NYC," and those parts represent a product category, a year, and a location, you should probably have three separate columns. Otherwise, you’re forcing yourself to use complex `LEFT`, `MID`, and `RIGHT` formulas every time you want to analyze data by year.
3:04 Lena: It’s like we’re making the computer do extra homework because we were too lazy to set up the columns correctly at the start. So, the first big rule is: each row-by-column value—what we call a cell—must contain exactly one value. No lists, no comma-separated strings, no "compound" values.
3:22 Miles: Precisely. And that leads directly into the second big pillar: no repeating groups. This is where most Excel users "commit spreadsheet," as one of our sources calls it. Imagine you’re tracking projects and you have columns for "Employee 1," "Employee 2," and "Employee 3."
3:39 Lena: Oh, I’ve done that! It seems so logical at the time. You think, "Well, a project usually has three people, so I'll just make three columns." But then what happens when a fourth person joins the team?
3:49 Miles: That’s the trap! You have to go in and manually alter the structure of your entire table. You’re adding a column, which might break your formulas or your pivot tables. In 1NF, you don't grow horizontally; you grow vertically. Instead of "Employee 1, 2, 3" as columns, you’d have multiple rows for the same project, each with a single employee name.
4:11 Lena: I can hear the objection already, Miles. "But then I’m repeating the Project Name over and over! Isn't that redundant?"
4:18 Miles: It feels redundant to the human eye, for sure. We like to see things once and then blank space below it. But for a computer, that repetition is actually "completeness." Rule number three is that every record must be complete unto itself. It shouldn't rely on the row above it for information. No merged cells! Merged cells are the absolute enemy of 1NF because they break the rule that the order of rows shouldn't matter.
4:44 Lena: That makes sense. If I sort a table with merged cells, everything breaks. The data becomes unmoored from its context. So, to recap our foundation: one value per cell, no repeating columns like "Phone 1" and "Phone 2," and every row needs to be a unique, self-contained record.
5:01 Miles: Spot on. And that uniqueness is key. You need a primary key—a column or a combination of columns where the value is never duplicated and never empty. In a simple list of students, that might be a Student ID. If you don't have a unique identifier for every row, you don't have a way to reliably reference that data later.
5:23 Lena: It’s fascinating because once you follow these rules, you realize you aren't just "cleaning" data—you’re actually designing a system. It’s the difference between a pile of clothes on the floor and a wardrobe where everything has its own hanger.
5:36 Miles: I love that analogy. And just like a wardrobe, once it's organized, finding that one specific shirt—or in our case, that one specific data point—becomes effortless.