Episode Transcript
Last week, I found Heinrich Müller's missing brother. Jacob Müller, a wheelwright who immigrated from Bavaria in 1909. Family stories said he came to America to work near his older brother Heinrich. But where was he? How do you find one common-named German immigrant in a sea of Müllers and Millers? Turns out, he was living five houses away from Heinrich the entire time. Right there, on the same 1910 census page. Hidden in plain sight for 115 years. I found him using something called the FAN club method. Friends, associates, and neighbors. It's one of the most powerful techniques in genealogy for breaking through brick walls. And, when you combine it with free AI tools, you can extract entire census neighborhoods into spreadsheets in about 10 minutes. Today, I'm walking you through exactly what I did. How I extracted 67 people from Heinrich's neighborhood. How I spotted the patterns that revealed Jacob. And, how you can use this same technique for your own ancestors. Let's find some missing family members. Welcome to Ancestors and Algorithms, where family history meets artificial intelligence. I'm your host, Brian. And, for the last year and over 1,000 hours, I've been using AI tools every day to break through genealogy brick walls. This episode is a direct continuation of episode 15. If you haven't listened to that one yet, I'd recommend going back. We spent that entire episode tracing Heinrich Mueller from his 1923 Pennsylvania death certificate all the way back to his village in Bavaria. But, after all that research, I still had a problem. Family stories said Heinrich had a younger brother named Jacob who also came to America. After finding Heinrich's naturalization papers, his Ellis Island record, even his German church baptism, I still had no idea where Jacob was. Last week, I figured it out. And, the answer was simple. Jacob was Heinrich's neighbor. Five houses away on the 1910 census. The technique I used to find him is called FAN club research.
Friends, associates, and neighbors. Professional genealogists have used this method for decades. But, combining it with AI tools makes it faster and more powerful than ever before. Here's what we're covering today. First, what the FAN club method is and why it works for breaking brick walls. Second, how I used free AI tools to extract an entire census neighborhood into a spreadsheet. Third, the patterns I found. Surname clustering, birthplace analysis, and chain migration clues. Fourth, how I verified Jacob was actually Heinrich's brother. And, fifth, how you can use this exact technique for your own research. By the end, you'll have a copy-paste ready prompt, a clear understanding of CSV files, and a completely new way of looking at census records. Let's dive in. So, here's where I was after episode 15. Heinrich Mueller's story was fully documented. I had his death certificate, his naturalization papers, his Ellis Island arrival, even his church baptism record from Stuttgart. Beautiful, complete research. But, the family story kept nagging at me. Heinrich's great-grandchildren remembered hearing about Uncle Jacob, a younger brother who was a wheelwright. The story said Jacob immigrated a few years after Heinrich to work near him, since blacksmiths and wheelwrights often worked together. So, I tried searching for Jacob Mueller in Pennsylvania. Hundreds of results. Jacob Miller, the Anglicized version. Even more results. Jack Mueller. Pages and pages. I tried limiting by birth year, but I didn't know when Jacob was born. Maybe 1875, 1880, 1885. That's a 10-year range, still drowning in results. I tried filtering by occupation, wheelwright, but that field is often blank or misindexed in database searches. This is the classic common name brick wall, and it stops researchers cold all the time. But here's what I did know: I knew exactly where Heinrich was living in 1910. Ward 15, Pittsburgh, Pennsylvania. Dwelling 201. The census told me everything.
His age. His wife, Carolina. Their two children. His occupation as a blacksmith. And I knew something else. Something genealogists have understood for centuries. People didn't immigrate alone. In the early 1900s, immigration was terrifying. You're leaving everything, your language, your home, your family, to go to a country where you might not know anyone. So, people moved in groups. Brothers followed brothers. Whole villages in Bavaria had cousins in the same Pennsylvania town. This pattern is called chain migration, and it means if Heinrich was in Pittsburgh in 1910, there was a good chance Jacob was nearby. This is where the FAN club principle comes in. Elizabeth Shown Mills, one of the most respected genealogists in America, coined this term. FAN stands for Friends, Associates, and Neighbors. The idea is simple but powerful. When you can't find direct evidence about your ancestor, look at the people around them. Your ancestor's neighbors can tell you where they came from. If the whole street is from Bavaria, your ancestor probably is too. Who they're related to. That woman next door with a different last name might be a married sister. What community they belong to. German speakers probably attended the same church. Even maiden names. Unmarried women often lived near their parents. Traditionally, doing this kind of research was tedious. You'd pull up a census page and manually transcribe every single person in the surrounding households. Name. Age. Birthplace. Occupation. Immigration year. It could take an hour just to do one census page. But here's where AI changes everything. AI can't tell you who Jacob is. It can't prove relationships. But it can extract all 67 people from a census page into an organized spreadsheet in about two minutes. And then we use our genealogist brains to spot the patterns. That's what I did last week. I took Heinrich's 1910 census page.
Used a free AI tool to extract every person in his neighborhood and started looking for patterns. And there was Jacob. Dwelling 203. Just five houses away. Age 31. Birthplace Bavaria. Immigrated 1909. Occupation. Wheelwright. He'd been there the whole time. I just needed to look at the whole neighborhood instead of just Heinrich. Now, let me show you exactly how I did it. Before I show you the extraction process, we need to talk for a minute about how we organize this data. Specifically, something called a CSV file. I know, some of you just went, oh no, not technical stuff. Stay with me. This is actually really simple and understanding it makes you way more powerful when working with AI. CSV stands for comma-separated values. Comma-separated values. Imagine you have a list of people with their names and ages. The simplest way to write it would be John comma 45, Mary comma 38, Sarah comma 62. See those commas? That's literally what CSV means. You separate each piece of information with a comma. Why does this matter for genealogy? Because CSV is the universal language that spreadsheet programs understand. Google Sheets opens CSV files. Microsoft Excel opens CSV files. Apple Numbers opens CSV files. It's like the Esperanto of data. Every program speaks it. When you upload a CSV file to Google Sheets, it automatically turns those commas into nice, neat columns. John goes in the name column. 45 goes in the age column. Perfect organization. Now, here's the brilliant thing. We can ask AI to create a CSV file from a census image. We say, quote, look at this handwritten 1910 census page and give me all the information in CSV format, end quote. And it will read that messy penmanship and turn it into clean, organized data. For my Heinrich Mueller project, I asked AI to create a CSV with these columns. Dwelling. House number from the census. Family. Family number, sometimes multiple families per house. Last name. Surname as written. First name. Given name as written.
Relationship. Relationship to head of household. Age. Age on the census day. Sex. Male or female. Race. As recorded on the census. Birthplace. Usually a state or country. Father birthplace. Father's birthplace. Mother birthplace. Mother's birthplace. Immigration year. Year they came to America. Naturalized. Citizenship status. Occupation. Their job. Can read. Literacy indicator. Can write. Literacy indicator. That's a lot of information. Normally, transcribing all that from one census page would take an hour. But last week, using Gemini 2.5 Flash, Google's free AI model, I got all 67 people extracted in about 90 seconds. Now, AI isn't perfect. Sometimes, it misreads handwriting. Sometimes, it gets an age wrong by a year or two. That's okay. Remember our golden rule. AI is your research assistant, not your researcher. AI got me 95% of the way there instantly. But I still verified the important stuff. Ages for Heinrich and Jacob, immigration years, birthplaces. I checked its work. But even with verification, the whole process took 10 minutes instead of an hour. Alright, here's what I did last week to extract Heinrich's census neighborhood. Step 1. Getting the census image. First, I needed the actual census page. I have an Ancestry subscription, so I found Heinrich in the 1910 census. Pittsburgh, Pennsylvania, Ward 15, Enumeration District 155, Sheet 12A, Dwelling 201. I clicked view image to see the actual handwritten census page. Then clicked download image to save it as a JPEG file to my computer. If you don't have an Ancestry subscription, you can access many census records free on FamilySearch. You'd find the census image there and take a screenshot. Same result. For my UK listeners, the same technique works with UK census records from 1841 to 1921. In Australia, you can use this with New South Wales census records or other state census collections. Step 2. Choosing the AI tool. I tested three different free AI tools for this project.
Claude, very accurate with handwriting, especially older records. Gemini 2.5 Flash, good accuracy, can create Google Sheets directly. ChatGPT, decent for clear handwriting, struggles with messy writing. I chose Gemini 2.5 Flash for this project because it can create Google Sheets automatically. That meant I didn't have to copy and paste the CSV data manually. Step 3. Writing the prompt. Here's the exact prompt I used. This is word for word, copy-paste ready. Quote, I am analyzing a 1910 U.S. federal census page for genealogy research using the FAN club method (Friends, Associates, Neighbors). Please carefully examine this census image and extract all individuals shown into a CSV format with the following columns. Dwelling, family, last name, first name, relationship, age, sex, race, birthplace, father birthplace, mother birthplace, immigration year, naturalized, occupation, can read, can write. For each person on this page, use exact spelling as written, even if it seems incorrect. If a field is blank or illegible, use unknown. For birthplace, record exactly as shown, e.g. Germany or Bavaria or Pennsylvania. Pay special attention to immigration years and naturalization status. Include everyone on the page, not just heads of households. After creating the CSV, please create a Google Sheet with this data and share the link with me. I will verify all information, but I need you to help me quickly extract this data so I can analyze patterns in this neighborhood. End quote. Notice what I did there? Number one, I told it what I'm trying to do, the FAN club research. Number two, I gave it the exact format I wanted, CSV, with specific columns. Number three, I told it how to handle problems, what to do with the blank fields. Number four, I emphasized what matters most, immigration years and naturalization. And number five, I reminded it that I'm the researcher. I will verify all information. This is good prompt writing.
You're giving context, being specific, setting expectations. Step 4. Running the extraction. I went to gemini.google.com and made sure I was signed in with my Google account. The free version, Gemini 2.5 Flash, loaded automatically. I clicked the plus sign, uploaded my census image, pasted my prompt, and clicked Send. Gemini took about 45 seconds to analyze the image. Then, it gave me the CSV data right in the response. I could see it had extracted: Dwelling 197, Family 220, Schmidt, Johann, Head, 42, Male, White, Bavaria. Dwelling 198, Family 221, Weber, Frederick, Head, 38, Male, White, Germany. Dwelling 201, Family 224, Mueller, Heinrich, Head, 36, Male, White, Bavaria. And then I saw it. Dwelling 203, Family 226, Mueller, Jacob, Head, 31, Male, White, Bavaria. Bavaria, Bavaria, Bavaria, 1909, Alien, Wheelwright. Jacob Mueller, five dwellings away from Heinrich. Born in Bavaria, immigrated 1909, working as a wheelwright, exactly what the family story said. I found him in the first extraction, but finding him was just the beginning. I needed to analyze the whole neighborhood to understand the patterns and build the case that this was actually Heinrich's brother. Step 5. Creating the spreadsheet. Gemini offered to create a Google Sheet for me, which I accepted. It generated a link to a spreadsheet with all the data already organized in columns. I clicked File, Make a Copy, so I had my own version I could edit and annotate. Now, I had a beautiful spreadsheet with 23 households, 67 individuals, all extracted and organized. Step 6. Verification. Here's the critical part. I didn't just trust the AI output blindly. I went back to the original census image and spot checked. All the Mueller family entries, Heinrich, Jacob, and one more Mueller. Immigration years for German immigrants. Ages for key individuals. Birthplaces for Bavaria-born residents. I found two small errors.
One person's age was transcribed as 45 when it was actually 48, handwriting was smudged. And one occupation was listed as laborer when it was actually laborer, abbreviation confusion. I fixed those in my spreadsheet. Total verification time, about five minutes. This is important. AI got me 95% accurate data in 90 seconds. I spent five minutes fixing the 5% that was wrong. Total time, under 10 minutes for what would have taken an hour of manual transcription. That's the power of AI as your research assistant. Now, with all this data organized in a spreadsheet, I could start looking for patterns. With 67 people from Heinrich's neighborhood in a spreadsheet, I could finally play detective. I was looking for three specific patterns that would tell me about this community and help me verify that Jacob was actually Heinrich's brother. Pattern number one. Surname clustering. First thing I did was sort the spreadsheet by last name. In Google Sheets, I clicked the column header for last name, went to the data menu, and selected sort sheet by column. Here's what I found. Mueller, three families. Dwelling 201, Heinrich Mueller, age 36, immigrated 1907. Dwelling 203, Jacob Mueller, age 31, immigrated 1909. Dwelling 205, Peter Mueller, age 52, immigrated 1898. Schmidt, two families, both from Bavaria. Weber, two families, both from Germany. Koch, one family from Bavaria. Fisher, one family from Baden. Three Mueller families on one census page? All from Bavaria? All German speaking? Peter arrived first in 1898. Heinrich arrived in 1907. Jacob arrived in 1909. What are the chances Peter Mueller is a relative? Uncle? Older cousin? Someone who came first and wrote back saying, come to Pittsburgh, there's work. I didn't know yet. But, it was definitely a lead. Pattern number two. Birthplace clustering. Next, I used the filter feature in Google Sheets. I clicked data, create a filter, then clicked the drop down on the birthplace column and selected only Bavaria.
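For anyone who prefers code to spreadsheet menus, the surname sort and the Bavaria filter described above can be sketched in a few lines of Python. This is only an illustration under assumptions, not part of the workflow used in the episode: the column names, the function names, and the five sample rows (built from the names, ages, dwellings, and birthplaces read out in this episode) stand in for the full 67-person export, which is not reproduced here.

```python
import csv
import io
from collections import Counter

# Hypothetical excerpt of the extracted CSV. Only five of the 67 people and
# five of the sixteen columns are shown; the header names are assumptions.
SAMPLE_CSV = """dwelling,last_name,first_name,age,birthplace
197,Schmidt,Johann,42,Bavaria
198,Weber,Frederick,38,Germany
201,Mueller,Heinrich,36,Bavaria
203,Mueller,Jacob,31,Bavaria
205,Mueller,Peter,52,Bavaria
"""


def load_rows(csv_text):
    """Parse CSV text into a list of dicts, one dict per person."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def sort_by_surname(rows):
    """Order people by surname, then by dwelling number (like a sheet sort)."""
    return sorted(rows, key=lambda r: (r["last_name"], int(r["dwelling"])))


def surname_counts(rows):
    """Count how many listed people share each surname."""
    return Counter(r["last_name"] for r in rows)


def born_in(rows, place):
    """Keep only people whose recorded birthplace matches exactly."""
    return [r for r in rows if r["birthplace"] == place]


rows = load_rows(SAMPLE_CSV)

# Pattern one: which surnames repeat on the page?
for surname, count in surname_counts(rows).most_common():
    print(surname, count)

# Pattern two: who was born in Bavaria, and in which dwelling do they live?
for person in born_in(sort_by_surname(rows), "Bavaria"):
    print(person["dwelling"], person["last_name"], person["first_name"])
```

The birthplace match is deliberately exact, which mirrors the prompt's instruction to record birthplaces as written: someone recorded as Germany will not show up in a Bavaria filter, so it is worth scanning the unfiltered column for variant spellings before drawing conclusions.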
18 people. Out of 67 individuals on the census page, 18 were born in Bavaria. That's more than a quarter of the neighborhood. And look at where they lived. Dwelling 197, Bavaria. Dwelling 199, Bavaria. Dwelling 200, Bavaria. Dwelling 201, Heinrich, Bavaria. Dwelling 203, Jacob, Bavaria. Dwelling 205, Peter Mueller, Bavaria. Dwelling 207, Bavaria. They were clustered together. This wasn't just a German neighborhood. This was a Bavarian neighborhood. Probably people from the same region who all knew each other back home. Remember in episode 15, I traced Heinrich back to Stuttgart in Bavaria. I would bet money some of these other families were from the same village or nearby villages. Pattern number three. Immigration timeline, the wave. The third pattern was the most revealing. I sorted by immigration year. From earliest to latest. Here's what I saw. 1898, Peter Mueller. 1902, Frederick Weber. 1903, Johann Schmidt. 1904 to 1906, five more families arrived. 1907, Heinrich Mueller. 1908, two more families. 1909, Jacob Mueller. 1910, one family just arrived. This is chain migration in action. Peter Mueller comes first in 1898. He gets established, finds work, maybe starts a business. He writes home, America is good. There's work. Come join me. More families arrive through the early 1900s. Word spreads, the community grows. Heinrich arrives in 1907, probably with Peter Mueller's address in his pocket. Heinrich writes to his brother Jacob, come to Pennsylvania, I'm working as a blacksmith. There's a wheelwright shop down the street. They need good workers. Jacob arrives in 1909 and settles five houses away. That's the story the census is telling. Not through one name, but through the pattern. This is FAN club research at its best. So, at this point, I had strong circumstantial evidence. Two Mueller men, both from Bavaria, five-year age difference, Heinrich 36, Jacob 31. Both immigrated in the early 1900s, two years apart.
Complementary occupations, blacksmith and wheelwright, living five houses apart in a Bavarian neighborhood, part of a chain migration pattern. But I didn't have proof yet. Circumstantial evidence needs to be verified. That's when I turned to AI again. Not to do research for me, but to help me plan the next steps. Alright, let's pause for a second and recap where we are, especially if you're just joining us. I'm talking about the FAN club method today. Friends, associates, and neighbors. It's one of the most powerful techniques for breaking through genealogy brick walls. Last week, I was trying to find Jacob Mueller, the missing brother of Heinrich Mueller, whose complete story we traced in episode 15. Instead of searching blindly for Jacob Mueller and drowning in hundreds of results, I used the FAN club method. Here's what I did. I found Heinrich in the 1910 census and extracted his entire neighborhood. 67 people from 23 households using Gemini 2.5 Flash, a free AI tool. The extraction took 90 seconds. I organized all that data into a Google Sheets spreadsheet with columns for name, age, birthplace, immigration year, occupation, etc. Then I analyzed the patterns. Surname clustering. Three Mueller families on one page. Birthplace patterns. 18 Bavarians living in a cluster. Immigration timeline. A wave of chain migration from 1898 to 1910. And there was Jacob, dwelling 203, age 31, wheelwright, immigrated 1909, living five houses from his brother Heinrich. But finding him wasn't enough. I needed to verify the relationship and figure out how to research these brothers further. That's where we're headed next. Using AI to create a research plan. Stay with me. This is where it gets really good. So, I'd found Jacob. I had strong circumstantial evidence he was Heinrich's brother. But, I needed more than patterns. I needed documentation. This is where I used a different AI tool. Perplexity.
It's an AI-powered search engine that's really good at research planning because it searches the web in real time and gives you cited sources. Perplexity has a free tier and that's what I used for this project. Here's the prompt I gave Perplexity. Quote, I'm researching two brothers who immigrated from Bavaria, Germany to Pittsburgh, Pennsylvania in the early 1900s. I found them as neighbors on the 1910 census. Heinrich Mueller, age 36, blacksmith, immigrated 1907, naturalized. Jacob Mueller, age 31, wheelwright, immigrated 1909, not yet naturalized. I need to verify they are brothers and trace their family connections. What are the best next steps for this research? Please suggest, number one, specific record types I should search for both men. Number two, how to verify the family relationship. Number three, where to find records for Bavarian immigrants in Pennsylvania. Number four, any patterns or clues I should look for in the 1910 census neighborhood. Include links to relevant genealogy resources, if available, end quote. Perplexity searched multiple sources and came back with a prioritized research plan. Priority 1, naturalization records for Jacob. Perplexity suggested searching for Jacob's naturalization records. Since he wasn't naturalized in 1910, but Heinrich was, Jacob might have naturalized later. Pennsylvania naturalization records from 1906 to 1930 often include the petitioner's birthplace and parents' names. It gave me links to Ancestry collections and explained that Eastern District of Pennsylvania naturalizations are searchable. Priority 2, World War I draft registration. Both men would have been draft age during World War I. The registration cards include detailed physical descriptions, height, build, eye color, which could help verify they were siblings. It pointed me to FamilySearch, where these are free. Priority 3, ship passenger arrival for Jacob. Since I knew Jacob arrived in 1909, I should search Ellis Island specifically for that year.
The manifest might list a contact person in America, probably Heinrich. Priority 4, track through the 1920 census. Look for both men in the 1920 census. Are they still neighbors? Has their family situation changed? Any new relatives who've arrived? Priority 5, Pennsylvania death records. Eventually, search for death records for both men. Pennsylvania death certificates from 1906 forward usually list parents' names, which would prove the relationship. This was exactly what I needed, a clear roadmap. Not speculation. Actual record types with sources cited and links provided. I didn't do all of that research last week. That would take more than one session. But I did follow up on a few items. I searched for Jacob Mueller's naturalization petition. I found it. He naturalized in 1915 and the petition listed his birthplace as Stuttgart, Bavaria, Germany. The exact same village as Heinrich. That was my verification. Same surname. Same village. Living as neighbors in Pittsburgh. They were definitely brothers. I also found Jacob in the 1920 census. Still living in Pittsburgh. Now with a wife and three children. And guess what? Heinrich is living three doors away in 1920. The brothers stayed neighbors for at least two decades. This is how AI helps with research planning. Perplexity didn't do the research for me. But it saved me hours of figuring out what to search next. It gave me a strategic plan based on genealogical best practices. You can do the same thing for your research. Just adapt the prompt. Quote, here's what I found in the census. What should I do next? End quote. And Perplexity will give you a roadmap. Alright, here's your homework for this week. As always, I'm giving you three levels. Beginner level. Find one census record for one of your ancestors. Use the prompt I gave you today to extract just that one household. Don't worry about the neighbors yet. Just practice the extraction process. Get comfortable with AI and CSV files. Intermediate level.
Extract one complete census page for one of your ancestors. That's probably 15 to 25 families. Put it in a spreadsheet. Sort by surname. Look for patterns. Can you identify any potential relatives or interesting neighbors worth following up on? Advanced level. Do a full FAN club analysis across two census years. Extract your ancestor's neighborhood from, say, 1900 and 1910. Compare who stayed, who moved, who newly appeared. Track migration and settlement patterns. Then use Perplexity to create a research plan for your findings. Share your results. I want to hear what you find. Come join our Facebook group, Ancestors and Algorithms AI for Genealogy, and post your discoveries. Did you find a sibling you didn't know about? Did you spot a cluster of relatives? Did AI make a funny mistake transcribing a name? Share it. We're building a community of family historians figuring out AI tools together. You can also email me at ancestorsandai at gmail.com. I read every email and use your questions to plan future episodes. If you're in the UK or Australia and you try this with your census records, I especially want to hear from you. Tell me what works differently. What challenges you hit. What successes you had. International genealogy matters to me, and your feedback helps make this podcast useful for everyone. That's the FAN club method, friends. Friends, associates, neighbors. The people around your ancestor who hold keys to your toughest questions. Last week, I found Jacob Mueller in about 10 minutes. He'd been hiding five houses away from Heinrich for 115 years. I extracted 67 people into a spreadsheet using free AI tools. I spotted patterns of chain migration, surname clustering, and Bavarian community building in 1910 Pittsburgh. Most importantly, I built a research plan for proving Jacob and Heinrich were brothers, which I verified through Jacob's 1915 naturalization papers. AI helped me extract the data. AI helped me plan the research.
But I'm the one who spotted the patterns, verified the facts, and built the case. That's how it should be. AI is your research assistant, not your researcher. Thank you for listening to Ancestors and Algorithms. If this episode helps you, please leave a review wherever you listen. It helps other family historians find the show. Like, follow, and subscribe so you never miss an episode. Join our Facebook group, Ancestors and Algorithms AI for Genealogy, where we can continue these conversations. You can reach me at ancestorsandai at gmail.com or ancestorsandai.com. Until next time, I'm your host, Brian. Happy researching.
