175: 5 Unique Data Analyst Projects (beginner to intermediate)

Episode Transcript

[00:00:00] Here are five beginner data analyst projects that you can start today that will build your confidence and get you noticed by hiring managers and recruiters. Let's get into number one, but before we do, let me tell you that this video is brought to you by Julius, your AI data analyst companion. Connect to your database and or your business tools.

[00:00:32] Pull insights in minutes, no coding required. Thanks, Julius, for sponsoring this episode.

[00:00:37] For this project, we'll be selecting one stock and downloading its daily data. These are things like the open, the close, the number of shares traded that day. All sorts of different data, depending on where you get your data set. And we're going to analyze it.

[00:00:52] We're gonna try to figure out what happened with this stock. Did it go up? Did it go down? Did it go up quickly? Did it go down quickly? So on and so forth. And why [00:01:00] is this a good project? Well, right now, based on research I've done on my own data job board, FindADataJob.com, there are tons of financial analyst and business analyst roles open.

[00:01:11] And this type of project will not only teach you good skills to land those roles, but it'll also look really good on your portfolio when you're applying to those roles. So what skills do you need? Well, really you could do this analysis in almost any data tool you'd like, but Excel is a pretty good one because a lot of financial analysts and a lot of business analysts use Excel quite a bit.

[00:01:30] Inside of Excel, you're going to be needing to use things like graphing, so creating data visualizations inside of Excel. In this particular case, I recommend a line chart because stock data is very time series based, and line charts work really well for time series data.

[00:01:46] You're also going to want to be comfortable using formulas: creating new columns based off of the previous columns, and formulas to find the biggest ups and downs and changes in price, or maybe the average daily return. That would be things like the MAX, the MIN, the [00:02:00] AVERAGE function. And honestly, you could probably use a lot of different formulas to calculate a lot of different things, the percentage up and down, all that sort of good stuff.
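The same "new column from the previous column" idea can be sketched outside of Excel too. Here is a minimal Python version, assuming a made-up list of closing prices; the dates, prices, and the helper name `daily_returns` are all hypothetical and only there for illustration.

```python
from statistics import mean

# Hypothetical closing prices for one stock (made-up numbers for illustration).
closes = [
    ("2024-01-02", 100.0),
    ("2024-01-03", 110.0),
    ("2024-01-04", 99.0),
    ("2024-01-05", 103.95),
]


def daily_returns(prices):
    """Return (date, percent change vs. the previous close) pairs.

    The first day has no previous close, so it is skipped.
    """
    out = []
    for (_, prev), (date, cur) in zip(prices, prices[1:]):
        out.append((date, (cur - prev) / prev))
    return out


returns = daily_returns(closes)

# Equivalent of MAX, MIN, and AVERAGE over the new "daily return" column.
best_day = max(returns, key=lambda r: r[1])
worst_day = min(returns, key=lambda r: r[1])
avg_return = mean(r for _, r in returns)

print("Biggest up day:  ", best_day)
print("Biggest down day:", worst_day)
print("Average daily return:", round(avg_return, 4))
```

In a spreadsheet, the return column would be a formula dragged down the sheet and the three summary values would be MAX, MIN, and AVERAGE over that column.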

[00:02:08] You might also want to create some sort of an XLOOKUP or a VLOOKUP system that allows a user to input a date. It would spit back what the price of the stock was on that day, or maybe whether the stock went up or down that day, or how much it went up or down. You could build some pretty cool VLOOKUP systems that I think a lot of users would enjoy.
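To make the lookup idea concrete, here is a rough Python analogue of a date lookup. It is a sketch only: the `prices` table, its values, and the `lookup` function are hypothetical stand-ins for a sheet with a date column and a lookup formula.

```python
# Hypothetical daily rows keyed by date: date -> (open, close). Values are made up.
prices = {
    "2024-01-02": (98.0, 100.0),
    "2024-01-03": (100.0, 110.0),
    "2024-01-04": (110.0, 99.0),
}


def lookup(date):
    """Mimic a lookup on the date column.

    Returns the close, the open-to-close change, and the direction,
    or None when the date is not in the table (a weekend, for example).
    """
    row = prices.get(date)
    if row is None:
        return None
    open_price, close_price = row
    change = close_price - open_price
    if change > 0:
        direction = "up"
    elif change < 0:
        direction = "down"
    else:
        direction = "flat"
    return {"close": close_price, "change": change, "direction": direction}


print(lookup("2024-01-03"))
print(lookup("2024-01-06"))
```

Returning `None` for a missing date plays the same role as the "if not found" argument in a spreadsheet lookup.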

[00:02:26] Now that's a little bit more of a beginner project. If you wanna make it a little bit more intermediate or maybe even advanced, here are some things that you could do. Number one, I think, is you could get a bigger data set. So instead of daily data, where each row is a day, you could get maybe an hourly data set, or if you're feeling really brave, a minute-by-minute data set, and you could do all sorts of different things with

[00:02:47] the timings of it all. Or you could just add more stocks. So for example, if you just picked Nvidia, maybe you add Tesla to it or maybe you add Meta to it, and you could do an analysis of those three different stocks. Obviously the more [00:03:00] granular you get with time and the more stocks you add to your dataset, the more difficult it is to actually analyze,

[00:03:06] 'cause there's a lot more stuff going on. Maybe you could do something like calculating moving averages. So, you know, this is how much the stock was today, this is how much it was yesterday, this was the average, and this is how the average changes over time. That requires a rolling window average, and that's a little bit harder to do

[00:03:22] in any software, including Excel. Or maybe you could even try to predict the future: you could forecast here and use something like an ARIMA model, which is a time series forecasting model that allows you to predict what the price of something, or what the number, will be in the future. So that's how you take a beginner project and make it a little bit more of an advanced project.
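For a feel of what a rolling window average does, here is a small plain-Python sketch. The function name `moving_average` and the sample prices are made up; in pandas the usual route would be something like `Series.rolling(window).mean()`, but the loop below shows the mechanics without any library.

```python
def moving_average(values, window):
    """Simple moving average over a fixed window.

    Positions that do not yet have a full window of history yield None.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    out = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            # Drop the value that just slid out of the window.
            running -= values[i - window]
        out.append(running / window if i >= window - 1 else None)
    return out


# Hypothetical closing prices (made-up numbers).
closes = [10.0, 12.0, 14.0, 16.0, 18.0]
sma3 = moving_average(closes, 3)
print(sma3)
```

Each output value averages the current day and the two days before it, which is why the first two positions have no value yet.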

[00:03:45] By the way, if you're not sure where to get this data set or any of the data sets that we're going to talk about, I made a full tutorial with a ton of options for you to look through. You can click right here on YouTube right now and watch that next, or you'll find it in the show notes or [00:04:00] description of whatever podcast you're listening to.

[00:04:07] All right, project number two, and that's going to be analyzing real estate data. And the reason I really like this project is because this industry, it's never going away. Literally, it's been here forever. It'll always be here. It's going to last the eternities, like if there's anything left on planet Earth after AI destroys us all, it's gonna be real estate.

[00:04:28] Uh, and I think it's also fun and personal because you can actually like look at your neighborhood or your state and kind of compare it to different things around it. So, uh, I think projects are good when they're fun and they're personal to you.

[00:04:42] So for this project, I would try to find some sort of a real estate data set. Once again, if you don't know where to go, go to the show notes and I have a link to find these different data sets. Most states have something called open data or public data, where they are required to make some data public.

[00:04:58] And oftentimes that is [00:05:00] really real estate heavy. So I know here in Utah we have kind of an open real estate data system that allows us to download from it. I'm sure New York and California do too, because they are very public with their data sets. They have so much data you can download from them.

[00:05:13] Anyways, Google, try to use the resources I gave you, and find a cool data set with real estate involved. For this particular project, we're going to be analyzing the data inside of SQL. Could you do it in Excel? Sure. Could you do it in Python? Sure. But we're just gonna do SQL for right now.

[00:05:28] So the skills that you're going to need in SQL in order to do this project are, number one, getting data into SQL. It's highly unlikely that you'll be able to download a SQL data table or a SQL database from the internet. Oftentimes it is gonna be stored as Excel or as a CSV. And so one of the first skills that you're going to need to know, or at least learn, is how to get a CSV

[00:05:50] dataset into a SQL database. And it's a little bit tricky. It's a tutorial for another day. Comment down below if you're interested in me doing a tutorial on how to do that. [00:06:00] But that's the first skill you're gonna need. You're also gonna need to know SELECT, FROM, and WHERE. That's a really good start, right?
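One low-friction way to practice the CSV-to-SQL step is SQLite, which ships with Python. The sketch below is only one possible route, not the method used in the episode; the table name `houses`, the column names, and every value are invented, and an in-memory string stands in for a downloaded CSV file.

```python
import csv
import io
import sqlite3

# Stand-in for a downloaded CSV file; columns and values are made up.
csv_text = """city,price,sqft,has_pool
Provo,450000,2100,0
Provo,610000,2800,1
Salt Lake City,720000,2400,1
Ogden,380000,1900,0
"""

# ":memory:" keeps the database in RAM; pass a filename to keep it on disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE houses (city TEXT, price INTEGER, sqft INTEGER, has_pool INTEGER)"
)

# Read the CSV rows and convert the numeric columns before inserting.
reader = csv.DictReader(io.StringIO(csv_text))
rows = [
    (r["city"], int(r["price"]), int(r["sqft"]), int(r["has_pool"]))
    for r in reader
]
conn.executemany("INSERT INTO houses VALUES (?, ?, ?, ?)", rows)
conn.commit()

# SELECT ... FROM ... WHERE: houses priced above 400,000, most expensive first.
expensive = conn.execute(
    "SELECT city, price FROM houses WHERE price > 400000 ORDER BY price DESC"
).fetchall()
print(expensive)
```

With a real file, `open("your_file.csv", newline="")` would replace the `io.StringIO` wrapper, and the column list would need to match whatever the dataset actually contains.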

[00:06:06] Just like, how do I get the data from the dataset? How do I filter with WHERE? Then GROUP BY, ORDER BY, and HAVING: how do you group, how do you order or sort your dataset, and then with HAVING, how do you do WHERE-style filtering based off of the GROUP BYs and aggregations? So I think that alone would be pretty interesting.

[00:06:28] You could do things like, you know, what's the most expensive house in all the state of Utah? You could do something like, what's the average price of a house in each county or each city? You could do, you know, if you took all of the houses inside of one city and you added it all up, what's the most expensive city to buy,

[00:06:50] if I was like an overlord king with millions and bajillions of dollars? There are all sorts of different things you can kind of play with: the house price, with the county or the [00:07:00] city or even the state, or something equivalent to some sort of location. Data sets like this will often have house features as well.

[00:07:07] Maybe how big the lot is, maybe how big the house is. Um, sometimes you get lucky and you can get like really granular, like, does it have a pool? Yes or no. And so you can do interesting things like what's the average price of a house with a pool versus the average price of a house without a pool, those types of analyses.
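Those questions map pretty directly onto GROUP BY, HAVING, and ORDER BY. Here is a hedged sketch using SQLite from Python; the `houses` table, its columns, and all of the prices are hypothetical, so treat the queries as templates rather than results.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE houses (city TEXT, price INTEGER, has_pool INTEGER)")
# Made-up rows for illustration only.
conn.executemany(
    "INSERT INTO houses VALUES (?, ?, ?)",
    [
        ("Provo", 400000, 0),
        ("Provo", 600000, 1),
        ("Ogden", 300000, 0),
        ("Salt Lake City", 700000, 1),
        ("Salt Lake City", 500000, 0),
        ("Salt Lake City", 900000, 1),
    ],
)

# Most expensive single house in the table.
most_expensive = conn.execute("SELECT MAX(price) FROM houses").fetchone()[0]

# Average price per city, keeping only cities with at least two listings.
# HAVING filters on the aggregate, which WHERE cannot do.
avg_by_city = conn.execute(
    """
    SELECT city, AVG(price) AS avg_price, COUNT(*) AS n
    FROM houses
    GROUP BY city
    HAVING COUNT(*) >= 2
    ORDER BY avg_price DESC
    """
).fetchall()

# "Buy the whole city": total of all house prices per city, biggest first.
total_by_city = conn.execute(
    "SELECT city, SUM(price) AS total FROM houses GROUP BY city ORDER BY total DESC"
).fetchall()

# Average price with a pool versus without a pool.
avg_by_pool = conn.execute(
    "SELECT has_pool, AVG(price) FROM houses GROUP BY has_pool ORDER BY has_pool"
).fetchall()

print(most_expensive, avg_by_city, total_by_city, avg_by_pool, sep="\n")
```

The `HAVING COUNT(*) >= 2` threshold is an arbitrary choice here; on a real dataset it keeps one-listing cities from dominating the averages.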

[00:07:23] And I'd say that's kind of a beginner leaning to intermediate project. If you wanna make it more advanced, once again, another thing you can do is add more data, right? That's one of the things that makes things more complicated. So in this particular case, I would combine it with another data set, a supplementary or complementary data set that would give you more information.

[00:07:45] Maybe it's more information on the houses, which might be hard to find. Something that might be a little bit easier to find is a crime data set. And you could compare house prices to how much crime's going on, right? Like, theoretically [00:08:00] the more crime there is, the cheaper the house.

[00:08:03] I don't know, that might not be true. Like, Manhattan's really expensive and there's a lot of crime there. That's the type of analysis that you could look into. If your data set is a little bit more rich, you could probably do something like a window function, which is more advanced, to find the three most expensive houses in each neighborhood or city or county or whatever.

[00:08:20] Or if you have historic data that shows you how the price of a house has changed over time, you could do some sort of a window function that would show you averages rolling over time, or some other sort of time analysis inside of SQL.
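The "top three per city" question is a classic window function pattern. A minimal sketch follows, again in SQLite via Python and assuming an invented `houses` table; note that window functions need SQLite 3.25 or newer, which recent Python builds generally bundle.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE houses (address TEXT, city TEXT, price INTEGER)")
# Made-up listings for illustration only.
conn.executemany(
    "INSERT INTO houses VALUES (?, ?, ?)",
    [
        ("1 A St", "Provo", 400000),
        ("2 B St", "Provo", 600000),
        ("3 C St", "Provo", 500000),
        ("4 D St", "Provo", 300000),
        ("5 E St", "Ogden", 350000),
        ("6 F St", "Ogden", 250000),
    ],
)

# ROW_NUMBER() restarts at 1 inside each city (PARTITION BY) and counts
# from the most expensive house down (ORDER BY price DESC).
top3 = conn.execute(
    """
    SELECT city, address, price
    FROM (
        SELECT city, address, price,
               ROW_NUMBER() OVER (PARTITION BY city ORDER BY price DESC) AS rn
        FROM houses
    )
    WHERE rn <= 3
    ORDER BY city, price DESC
    """
).fetchall()

for row in top3:
    print(row)
```

The ranking has to live in a subquery because the `rn` column does not exist yet when the inner WHERE clause would run. A rolling average over time uses the same OVER clause, with an aggregate like AVG and a frame such as `ROWS BETWEEN 2 PRECEDING AND CURRENT ROW`.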

[00:08:35] All right. Project number three is creating a personal financial dashboard inside of Tableau or Power BI. And I like this project 'cause once again, it's personal. I think that we care about projects more, and we're more likely to get projects done, when they are personal.

[00:08:48] And also, I think this is useful to you in your life. Unless you're a budgeting expert, you might not know where your money is going. I know I could be a little bit better at that for sure. And [00:09:00] so downloading your historical data and analyzing it to try to see what you spend your money on, I think, is a really good exercise that everyone could benefit from.

[00:09:08] In the past, I've been able to use something like mint.com to hook up all my credit cards and bank accounts and have all the data be aggregated in one place, and then download it as a CSV. I know mint.com got shut down now.

[00:09:20] So just Google "mint.com download data alternatives" or something like that, and there are a lot of different options. I haven't tried them all, so I don't know the pros and cons of each one, but I'm sure you'll be able to find one that would aggregate all of your credit card, debit card, and bank data, and spit that out to you.

[00:09:35] Once you have that, you can upload it to Tableau or Power BI and create a pretty cool dashboard. Now, of course, to do this project, you're going to need some data visualization skills. Number one, you need to be able to create some sort of graphs. In this particular case, I think bar charts would work really well.

[00:09:50] I like donut charts for categories. You could do a pie chart, but pie charts kind of stink and people prefer donut charts. You could do some sort of line [00:10:00] chart if you have any sort of historic data, like how much you're spending has changed over time.

[00:10:08] I know that personally I'm spending more money today than I did when I was in college. Right? I have a little bit more income that I could possibly spend on dumb things like Pokemon cards. You're also going to need dashboard skills, and that's basically, how do you tie these graphs together across one canvas to tell an elegant story?
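Before anything goes into a bar or donut chart, the transactions have to be rolled up by category. Tableau and Power BI do this for you when you drag a field onto a chart, but here is a small Python sketch of what is happening underneath. The transactions, category names, and the `totals_by_category` helper are all made up.

```python
from collections import defaultdict

# Hypothetical exported transactions: (date, category, amount). Values are made up.
transactions = [
    ("2024-11-01", "Rent", 800.0),
    ("2024-11-03", "Groceries", 100.0),
    ("2024-11-10", "Groceries", 50.0),
    ("2024-11-15", "Hobbies", 50.0),
]


def totals_by_category(rows):
    """Sum the amount column per category (the bars of a bar chart)."""
    totals = defaultdict(float)
    for _, category, amount in rows:
        totals[category] += amount
    return dict(totals)


totals = totals_by_category(transactions)
grand_total = sum(totals.values())

# Share of total spending per category (the slices of a donut chart),
# sorted from the biggest slice to the smallest.
shares = sorted(
    ((category, amount / grand_total) for category, amount in totals.items()),
    key=lambda pair: pair[1],
    reverse=True,
)

for category, share in shares:
    print(f"{category}: {share:.0%}")
```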

[00:10:29] And that's actually the third skill you're gonna need: data storytelling. How do you tell an interesting story of your personal finances in Tableau or Power BI? And just one note with this: obviously you might not want this to be publicly available, or you might want to change the numbers somehow, or just make sure the data's not downloadable, because you might not want to give the whole world your transaction data.

[00:10:50] Maybe you do, I don't know, but just something to think about.

[00:10:56] Now I think this project would be kind of a beginner [00:11:00] Tableau or Power BI project, but if you wanna make it a little bit harder, look up what a tree map is and try to make a tree map that makes sense. Tree maps are a more advanced data visualization that I like to use. They kind of show

[00:11:12] categories of categories, and sizes, and how much of a total something is. I really enjoy using them. I think they look really cool. They're a little bit hard to understand, so it is a little bit more of an advanced data viz technique. You could also try some of the built-in forecasting tools inside of Tableau or Power BI to predict future spending, but that

[00:11:30] also depends on what version of Tableau or Power BI you have. I don't know that all of them have that forecasting tool. If yours does, you could also look into seasonality effects. So for example, I know I spend a lot of money in November and December, kind of gearing up for the holidays and Christmas and stuff like that.

[00:11:46] And it would be interesting to have that trend reflected inside of the data. [00:12:00]
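A quick way to check for that kind of holiday bump before building a chart is to total spending by month. This is a rough sketch under the assumption that dates are exported as ISO strings like "2024-11-12"; the rows and the `monthly_totals` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical (date, amount) rows with ISO-formatted dates. Values are made up.
rows = [
    ("2024-10-05", 200.0),
    ("2024-11-12", 350.0),
    ("2024-11-28", 150.0),
    ("2024-12-20", 600.0),
]


def monthly_totals(transactions):
    """Sum spending per calendar month, keyed by 'YYYY-MM' and sorted by month."""
    totals = defaultdict(float)
    for date, amount in transactions:
        totals[date[:7]] += amount  # the first 7 characters are 'YYYY-MM'
    return dict(sorted(totals.items()))


by_month = monthly_totals(rows)
peak_month = max(by_month, key=by_month.get)

print(by_month)
print("Highest-spend month:", peak_month)
```

With several years of data, comparing the same month across years is what would separate a real seasonal pattern from a one-off expensive December.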

[00:12:03] Overall, I think this is a great project because it'll get you excited. I think in an interview it speaks really well of you 'cause you're like data driven in your real life. It's like you understand the power of data and how it can help you make better decisions and I think that's what recruiters and hiring managers are really going for most of the time.

[00:12:20] Project number four. All right. We have talked about stock data, real estate data, and personal finance data. We've done a lot with money, which is great. Like I said, there are a lot of financial and business analyst roles that are opening up. But let's switch it up for Project four and Project five and do something a little bit more fun because I think data can be fun.

[00:12:38] And we're gonna talk about one of my hobbies here with project number four, which is Pokemon analysis. And you can do this in Python. Now, why is this a good project? Well, once again, I think if you think something is fun, you're more likely to spend time on it and it's more likely to be less frustrating, uh, and you're actually going to accomplish getting the project done.

[00:12:59] So, [00:13:00] it doesn't have to be Pokemon, but something you're passionate about. And I'm personally passionate about Pokemon. So once again, I would go to those resources I linked earlier to find a cool Pokemon data set. And I think this would look good on a resume just because once again, you're showing recruiters and hiring managers that you care to analyze interesting things and you're able to pull insights out of any data set.

[00:13:19] Doesn't matter what type. So for this, you're obviously going to need some Python skills. The number one python skill that you can learn as a data analyst is going to be pandas. That's basically the library for data cleaning and data manipulation, getting your data into Python. Doing different operations, like creating new columns with formulas, uh, filtering your table based off of different conditions, that's all going to be handled inside of pandas.
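Here is roughly what those pandas basics look like in practice: loading a CSV, creating a new column with a formula, and filtering on conditions. This is a sketch only. The column names and the `attack_defense_ratio` metric are assumptions, and the handful of hand-typed rows are illustrative placeholders that may not match any real Pokemon dataset.

```python
import io

import pandas as pd

# Stand-in for a downloaded CSV; treat the values as illustrative placeholders.
csv_text = """name,type1,type2,attack,defense,speed
Bulbasaur,Grass,Poison,49,49,45
Charmander,Fire,,52,43,65
Squirtle,Water,,48,65,43
Pikachu,Electric,,55,40,90
Snorlax,Normal,,110,65,30
Gyarados,Water,Flying,125,79,81
"""

# Getting data into Python: with a real file this would be pd.read_csv("file.csv").
df = pd.read_csv(io.StringIO(csv_text))

# Creating a new column with a formula, like a calculated column in Excel.
df["attack_defense_ratio"] = df["attack"] / df["defense"]

# Filtering the table on conditions; & combines them like AND.
fast_attackers = df[(df["speed"] > 60) & (df["attack"] > 50)]

print(df.head())
print(fast_attackers[["name", "attack", "speed"]])
```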

[00:13:46] So that's the number one skill you're going to need. The number two skill that you can learn in Python as a data analyst is going to be data visualization. The most common Python library to use is Matplotlib, [00:14:00] and it's great. It's also a little bit basic. I prefer something called Seaborn instead.

[00:14:05] For this analysis, one of the first things you can do is compare and contrast different Pokemon types. For example, how many water Pokemon are there versus how many grass Pokemon?

[00:14:33] Then you could look into secondary types. Well, how many Pokemon have a secondary type, and what's the most popular secondary type?

[00:14:43] Who are the top five strongest Pokemon in terms of attack? And who are the top five weakest Pokemon in terms of defense? You could even take the height and weight of a given Pokemon and see how that affects their speed, defense, or attack. [00:15:00] Do the bigger, taller Pokemon always have stronger defense?
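Each of those questions is close to a one-liner in pandas. The sketch below assumes a tiny hand-made table with hypothetical column names (`type1`, `type2`, `height_m`, `weight_kg`, and so on); the values are placeholders, and a real dataset will name things differently.

```python
import pandas as pd

# Tiny illustrative table; values are placeholders, not an authoritative dataset.
df = pd.DataFrame(
    {
        "name": ["Bulbasaur", "Charmander", "Squirtle", "Pikachu", "Snorlax", "Gyarados"],
        "type1": ["Grass", "Fire", "Water", "Electric", "Normal", "Water"],
        "type2": ["Poison", None, None, None, None, "Flying"],
        "attack": [49, 52, 48, 55, 110, 125],
        "defense": [49, 43, 65, 40, 65, 79],
        "speed": [45, 65, 43, 90, 30, 81],
        "height_m": [0.7, 0.6, 0.5, 0.4, 2.1, 6.5],
        "weight_kg": [6.9, 8.5, 9.0, 6.0, 460.0, 235.0],
    }
)

# How many Pokemon of each primary type (water versus grass, and so on).
type_counts = df["type1"].value_counts()

# How many have a secondary type, and how often each secondary type appears.
has_secondary = int(df["type2"].notna().sum())
secondary_counts = df["type2"].value_counts()

# Top five by attack, and the five lowest by defense.
strongest = df.nlargest(5, "attack")[["name", "attack"]]
weakest_defense = df.nsmallest(5, "defense")[["name", "defense"]]

# Correlation matrix: do height and weight move together with the battle stats?
corr = df[["height_m", "weight_kg", "speed", "defense", "attack"]].corr()

print(type_counts, secondary_counts, strongest, weakest_defense, corr, sep="\n\n")
```

A correlation near zero or an obvious exception in a scatter plot is exactly the kind of outlier worth writing up in the project.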

[00:15:06] And are they always slower, or is that not the case? Are there any outliers? And of course, if you wanna make this less of a beginner project and more of an advanced project, I actually think this would be a great use case to break out some basic machine learning, specifically clustering. If you've never heard of clustering before, it's basically like,

[00:15:25] hey, how many natural groups are in my data set? So in this particular case, it'd be like, hey, if we had to put a number on how many groups of Pokemon there are, what would you say that number is, and what group does each Pokemon fall into? So for instance, how many natural clusters are there?

[00:15:45] Do they follow the type family that the Pokemon Company assigns each Pokemon? If a Pokemon is two different types, does it fall into each cluster equally? If you wanted to, you could even dive [00:16:00] into dimensionality reduction and do something like principal component analysis, otherwise known as PCA, and see visually how these clusters all go together.
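To demystify clustering a bit, here is a tiny k-means written in plain Python. It is a teaching sketch, not a recommendation: the `kmeans` function, the choice of k-means itself, and the (attack, speed) points are all assumptions, and in a real project a library implementation such as scikit-learn's `KMeans` would be the more typical choice.

```python
import math


def kmeans(points, k, iterations=20):
    """Tiny k-means that returns (centroids, labels).

    Starts from the first k points so the result is repeatable; real
    implementations pick starting centroids more carefully.
    """
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids, labels


# Hypothetical (attack, speed) pairs: three fast-but-weak and three slow-but-strong.
points = [(50, 90), (110, 30), (55, 85), (120, 35), (48, 95), (115, 25)]
centroids, labels = kmeans(points, k=2)

print("Centroids:", centroids)
print("Labels:   ", labels)
```

Comparing the resulting labels against each Pokemon's official type is one way to answer the "do the clusters follow the type family" question. With more than two stats, the features would usually be scaled first so that one large-valued column does not dominate the distances.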

[00:16:11] Now, if that sounded like Japanese and you're really scared, 'cause like, I don't know what Avery was just talking about, don't worry. That's for my advanced people who are watching this video, and you can just move on. In fact, let's all move on together to project number five, which is: let's look at some football data.

[00:16:34] Now, why is this a good project? Well, one, because I love football and I think it's fun to analyze football. But two, if you wanna work in sports analytics, this would be a great project. Or once again, it just shows that you can take a data set and extract insights no matter the subject matter. So for this project, I challenge you to use any tool.

[00:16:51] Don't necessarily use just Excel or just Tableau or something like that. Just pick one of them and go ahead and do the analysis, 'cause it's less about the tool [00:17:00] and it's more about how you think as an analyst. You're probably going to do some sort of aggregations in this analysis, right? Think of questions like:

[00:17:08] Who did the most, who did the least? You'll probably create some simple formulas like, Hey, what if we took the catches or what if we took like some sort of stat here and divided it by the number of plays on the field? What would that tell us? What can we call that metric? And you'd probably do some sort of a time series.

[00:17:25] So how does this player change over time? And you could do all of those in Excel, SQL, Tableau, it doesn't really matter. Just do those things. Heck, you could even use a tool like Julius AI if you wanna get AI involved to do this analysis really quickly. So for example, one thing you might be interested in, especially if you play fantasy, is who had the most fantasy points last year per play on the field?

[00:17:50] Right, like I don't know who had the most fantasy points last year. Maybe Lamar Jackson or something, but he's on the field the whole time. Are there any outliers that had a lot of fantasy points who weren't [00:18:00] on the field all that much? Those might be people of interest for this year.

[00:18:10] Instead of doing fantasy points, you could do rushing yards or you could do catches or something like that. It doesn't have to be fantasy points. You could change that metric right there. You could look at which quarterbacks had the most yards. How many more yards did Josh Allen have than Patrick Mahomes?
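The "stat divided by plays on the field" metric is easy to prototype in any tool. Here is a minimal Python sketch with anonymous, made-up players; the field names, the `points_per_play` helper, and the `min_plays` cutoff are all assumptions added for illustration.

```python
# Hypothetical season totals; names and numbers are made up.
players = [
    {"name": "Player A", "fantasy_points": 400.0, "plays": 1000},
    {"name": "Player B", "fantasy_points": 150.0, "plays": 250},
    {"name": "Player C", "fantasy_points": 300.0, "plays": 900},
    {"name": "Player D", "fantasy_points": 10.0, "plays": 5},
]


def points_per_play(rows, stat="fantasy_points", min_plays=100):
    """Rank players by a stat divided by plays on the field.

    min_plays drops tiny samples, so that one lucky play does not
    put a rarely used player on top of the list.
    """
    ranked = [
        {**row, "per_play": row[stat] / row["plays"]}
        for row in rows
        if row["plays"] >= min_plays
    ]
    return sorted(ranked, key=lambda row: row["per_play"], reverse=True)


ranking = points_per_play(players)
for row in ranking:
    print(f"{row['name']}: {row['per_play']:.3f} points per play")
```

Swapping the metric is just a matter of passing a different `stat` name, for example a rushing yards or catches column if the dataset has one. In this toy data, Player B has far fewer total points than Player A but comes out on top per play, which is the kind of outlier the question is looking for.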

[00:18:30] And of course, if you wanna make this more advanced, once again, the way to do it is add in more data. One of the cool things you could do here is add in multiple seasons, because anything can happen in one season, but if we add in 2, 3, 4 seasons, that starts to be a trend. Instead of just looking at player data, you could also probably look at like team data, and that becomes a little bit more complex.

[00:18:50] Maybe you can combine the two together and get some interesting insights. And lastly, and this is the hardest part, is you could predict how [00:19:00] each player will do this year. It's impossible to know, but if you want to take on a fun task, this would be a great feature of this project.

[00:19:14] By the way, each of these projects is something that you and I could definitely work on together as a capstone project inside of the Accelerator, which is my data analyst bootcamp program.

[00:19:37] In the program, we'll walk you through seven guided projects and then help you build an awesome capstone project. And then we'll put all eight of those on your data analyst portfolio, and we'll actually help you build the portfolio. And if that's of interest to you, go ahead and check it out at datacareerjumpstart.com/daa. [00:20:00]

[00:20:00] And actually, I'd love to do one of these projects here on YouTube or on the podcast. I'd love to build one of these five projects. So if you're watching on YouTube or you're listening on Spotify, please, please, please leave a comment below this episode with your favorite project idea and which one we should do a full tutorial of how to get started.

[00:20:19] Whichever one gets the most comments, we will call the winner. Sounds pretty fun, right? So leave your comments down below.

[00:20:27] And lastly, if you want even more data project ideas, then you're going to need to subscribe to my newsletter, the Data Career Newsletter.

[00:20:50] I try to include one data project idea every single week, so subscribe. It's a hundred percent free. See you in the next video.
