Charlie Yielding and Charlie Apigian revisit the topic from Episode 01, examining how the perception and application of data have fundamentally changed over the past year.
The new meaning of data revolves around the idea that data is no longer just a static collection of facts and figures. Instead, it has evolved into a dynamic resource that can be manipulated and interpreted in various ways to generate insights and drive decision-making. This shift in perception is largely due to advancements in artificial intelligence and cloud infrastructure, which have made it possible to collect, analyze, and interpret data on an unprecedented scale.
Data 4 All Recommendations and Links:
Find full episodes and clips on our YouTube channel: https://www.youtube.com/channel/UC7JZl_CnkSnGOw6ocQUKZcg
[00:00:00] We are officially recording. Yeah, we could just do a soft roll into the, into the thing. I, we don't have to though. You want to? I like the idea of, um, starting the podcast by talking about the, uh, like for me, Notebook LM is definitely a situation where I have had to rethink, like, how I collect and how I, um, set up data and like what data is.
What data I have access to and what it means to me and stuff, because I'm essentially creating a, uh, a bot that's an expert on really small subjects and stuff, or really tight defined subjects. And so the, uh, the change of like what data was to what data is, you know, just from the last time we talked about it, it's just so, I don't know.
It's just so different. And it just, it feels obvious that we're redoing this right now because the, [00:01:00] um, Like, our perception of the data we work with is, like, fundamentally changed, I guess. So, then why don't we do a podcast on that? Sound like a good plan? Let's get to it then. All right, we can just do the intro.
All right. I know, I think that's going to end up being the, uh, open. And if I had this open, okay, you're, you're the one starting this time. Mm hmm. All right. Well, I'm going to clap now, but I'd love that to be the intro. So, here we go.
On today's episode of Data for All, the new meaning of data.
Welcome to the Data for All podcast. I'm Charlie Yielding. And I'm Charlie Apigian. We want to empower you to think different with data. And on today's podcast, we're revisiting the [00:02:00] topic that we brought up in episode one. Number one. And that's what is the meaning of data? And we're going to dissect it to address the changes and advancements over the past year, because I got to tell you, what we started with and where we're at right now are two totally different things as far as like, what, what is data and how can we apply, how can we apply it to our everyday lives?
Yeah, and you know, obviously, I've hinted at this, I think both of us have a lot to say about this topic. At the core of all of this is the advancements in AI, in cloud infrastructure. Uh, tech startups out there, all of them are doing a lot with data. And so that's really what we're wanting to dive into.
But before we get to that, Charlie, um, you know, the holidays are now over, by the way, I'm going to CES in a couple of days. I know. And so I'll have a lot of really cool stuff, uh, to bring back. Okay. Um, but I want to know, are you playing with anything new? You kind of [00:03:00] mentioned, uh, a notebook ML at the beginning here, uh, LM.
Yeah. LM, LM. Oh, that's the ML for machine learning is what I think. Yeah. Yeah. I think it's notebook LM for language model. I'm not, I'm not a hundred percent sure. Uh, but I've been playing. Yeah. I've been playing with, uh, with that quite a bit. I'd signed up for the beta. I brought it up in the, uh, the Christmas episode is like a spare app that I'm, I'm playing with and whatnot, or just a new app that I'm playing with.
And, uh, It's turning out to be really, really useful. And so, uh, Notebook LM, for those of you who don't know what it is, it's just an, it's an application in its beta format from Google. And, uh, it's like a fancy notes application. So you can, um, just like with ChatGPT, you've got a text box that you can ask any questions of, but when you first get into it, there's no, uh, there's no information.
So you've got to add sources. So you can add You can add, uh, PDF files, you can add text from, uh, uh, websites, and you [00:04:00] can, um, you can paste things in as well. And so there's, there's a couple of different modes and they're going to be adding on to that later as far as how you get stuff in. You can just type stuff in as well.
And Notebook LM, is that a certain company? It's Google. Oh, it is a Google thing. Yeah, yeah. Oh, yeah. I mentioned that earlier. Yeah, yeah. Sorry about that. Um, I have been playing with Perplexity more, um, and so I started using it. You know, being the academic, I like that it gives you the sources, and now they've got a browser extension, so you can go into it, and, um, if you get to a certain page, you can ask it a question and say, based on this website, this page, or just in general, and using the browser, uh, extension, you can ask any question, just like you would in ChatGPT.
That's cool. Yeah, and it does a real, I like the way that it organizes, it'll give you the same answer. And then below that, it'll, it'll look like almost like a Google search and give you some of the different things. Um, so I really like what they're doing lately. I like how Perplexity displays things once [00:05:00] it, once it spits it out, 'cause it'll give you images.
It'll, it, it, you kind of hit it on the head. It's like a, it's like a search engine results 2.0, if you will. 'Cause it gives you a little bit of videos, a little bit of images, a little bit of background and stuff like that. I still use Perplexity just to look up people and actually referred, uh, referred it to a family member who's like, uh, you know, some people out there are internet sleuths.
I appreciate you. You keep us all straight. Uh, but every time they meet somebody, they go, you know, scour the internet for the, you know, basically for their presence. And that's the one thing I like about Perplexity is I can just type in, like, tell me about this person from this place and it's like, here's everything on the internet that we could find.
Right. Yeah. Uh, and, um, It's scary though. Uh, you know, you mentioned search engine. They actually just received a ton of money to create their own search engine that they think will be able to compete with Google. Uh, it Yeah, and so if we're going back to predictions, I'm gonna be, I think I'm going to get, that's going to be a prediction that's going to be easily attainable [00:06:00] at this point.
Yeah, and which one was that? That search engines and assistants will be replaced by Gen AI, or empowered by Gen AI. Yeah, in a completely different way, but, uh, what Perplexity does is also a little bit familiar as well, because it does give you the links, where some people I think are still, uh, thinking of, you know, ChatGPT as, it gives me an answer, but I have no idea where it comes from, so why would I trust it?
Well, so that's a good point, because, uh, Perplexity does give you links that you can click on and stuff like that. I mean, GPT will do that as well in the, in the, uh, paid version. Excuse me, but, um, there was one thing I was going to say about the Perplexity thing. Oh yeah, Perplexity, you can just do single sign-on through your Google account and stuff.
So it's really easy to get into and it's free as well. Yeah, so I'd say that's been the one thing I've probably been playing. And people when they ask me what should they be using, I'm usually asking them to look at [00:07:00] that. I actually just had a law professor reach out to me and ask what should he do about Um, and so I gave him a pretty long answer.
Um, and you just, you know, you didn't tell him no. I told him embrace it because every one of them, they better be using it. Well, they, so LexisNexis and Westlaw both have AI components in them and you can write your briefs from there. And so you can't like at this point, the two major, uh, uh, legal, like reference pieces of software.
They already have AI in them, so. And, and for that reason, um, doing their best, you know, what they're going to have to do is be better at understanding what is real, what is not real. If there's hallucinations, that, that they have to be the ones that are responsible for it. And then it's more of teaching people the right thing, and so if you do have it completely written by something else, just like if you had it written by another person and you don't [00:08:00] check it.
You're responsible. No, that reminds me, I got to tell you about a hater I met this past week, but I wanted to go back to the notebook LM real quick because I didn't tell how I was using it. So a buddy of mine reached out, who's an executive at a company who was just like, Hey, can you write me a paper on, or can you, can you give me a brief or whatever you want to call it on, um, like AI's impact in 2024.
And so I thought I was, you know, I was like, I'll just look up a couple articles, read those, see what they're saying that may be different from what we're saying, because we talk about this every day almost. And, uh, I was pleasantly surprised to find that there wasn't anything that was too out there that, like, we hadn't considered before.
But I was also, uh, very underwhelmed with the amount of information with regards to, like, what we're actually gonna do. You know, the action steps that we're talking, that we've talked about before. So we're building insight. But we're not doing anything with it yet. There's nobody on the internet actively saying it.
So, with NotebookLM, I took all of these articles and posted them in there. And then I built some of my own [00:09:00] notes and put them in there. And then I'm, I'm, uh, It's making it a lot easier to wrap my head around like what the industry's saying versus what I'm saying. And how those things fit together. So I can build a cohesive document that I can share with my buddy.
But a little five-hour project has turned into quite a bit more, because I was surprised by the lack of certain types of information on the internet. Yep. Yeah. And, uh, you know, that reminds me of when I, I started writing with an app called Scrivener. Mm-hmm. And it allowed me to write in chunks, and I could add in references as a separate thing.
And it was just a big app. I mean, this was 10 years ago, I started using it. Uh, and it was only a Mac app at the time. Uh, but now what it's, what you're doing is, is a lot more. Um, uh, uh, that on steroids for sure. Yeah, exactly. Um, yeah. Is that the one that makes your screen just black and white and you type like you're writing in DOS?
No, uh, there's a couple of those. No, this [00:10:00] one, it, it really is you're writing like small chunks and then you can have referenced, uh, material in there and it almost like creates everything as a small note that then can be, uh, brought together. Yeah. Um, so, um, Yeah, I haven't used it in a while. I don't even know how much it's being used.
Um, but the ones you're talking about, like iA Writer, uh, ones that are more like, uh, just simple Markdown. Yeah. Um, I love writing like that. Is Markdown an application? Markdown is a way to write. Okay. Um, and so it's where, if you ever see where, uh, somebody uses two hashtags and it counts as a heading two, that's Markdown.
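Editor's sketch of the "two hashtags counts as a heading two" rule described above. This is a simplified illustration, not a full Markdown parser; the function name `heading_level` is made up for this example, and it only handles the common case of one to six `#` characters at the start of a line followed by a space.

```python
def heading_level(line: str) -> int:
    """Return the Markdown heading level (1-6) of a line, or 0 if it is not a heading.

    Simplified assumption: the line starts with the '#' characters directly,
    with no leading indentation.
    """
    # Count the '#' characters at the very start of the line.
    hashes = len(line) - len(line.lstrip("#"))
    # A heading needs 1-6 hashes followed by a space (or nothing at all).
    if 1 <= hashes <= 6 and (len(line) == hashes or line[hashes] == " "):
        return hashes
    return 0


# A Markdown cell like the one described in the conversation.
print(heading_level("## This is a regression model"))
# A hashtag with no space after it is ordinary text, not a heading.
print(heading_level("#data"))
```

In a Jupyter notebook the `##` line would live in a Markdown cell, while explanations of individual lines of code would go in `#` comments inside the code cell.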
So I, I use Markdown a ton. I've always liked just in your formatting. I do because, um, in code you can use it, especially when you're using like Jupyter notebooks for, uh, the data science stuff. I'm fully aware of that. Yes. Well, that's what I use to write my Python [00:11:00] code. And I can add Markdown around it. So I can add a heading.
Yeah. And it says, this is a regression model. And then underneath that, I can change it. I can add, uh, hyperlinks, everything all in Markdown. Um, and, and it looks then good once you, you know, put it into, uh, presentation mode. Um. And then I have it, when I do things like that, then my code ends up being my textbook for students.
So a student will get code, but it'll tell them, okay, it's a regression model, and then it gives them a full description of what they're doing, and then it shows them the code. Um, so quick question. Do you, if you're coding for yourself, apart from class stuff, do you notate your own code? I do. How much? Uh, not enough.
Not enough, okay. Nobody, if anybody ever says they do it enough, they're lying. Well, I've talked to people who say that any, any notation is just an, uh, is just, uh, it's a distraction from the actual work you're [00:12:00] doing. Uh, that is true. If you write good code, you should not have to notate. Except if you're teaching it to other people.
Right. Then you want to. So what I do is I use Markdown, which is outside of code. Um. To explain things, but then if I'm explaining code, I write comments. Yeah, and and and for me It's two things. I have amnesia. So in two years when I look at it again, I want to know what the heck I did Right. Yeah, but more importantly I'm usually writing very simple Syntax because I want students to be able to understand it.
Yeah, a lot of times the the way I see it is And I know we're going on a complete and total tangent, but the, uh, when, when people are writing code and they don't write comments, I find that to be inconsiderate just in general, because that's assuming that you're the only person that's ever going to look at this code.
And it's assuming that you're never going to change your style of code. So even to your point, so if you're going back to look at a piece of code, you wrote four [00:13:00] years ago, what styles have changed? Like what slang is different than it used to be in stuff. And you're into your, also to your point, you like, it helps you catch up.
When you're like, oh, yeah, that's right. This is the way it was, as opposed to, let me figure out exactly why I did it this way so long ago and then waste my time. Yeah, for sure. And, uh, so anyways, I could talk code all day long. You know that I like comments, and I like having arguments about comments, and I've never commented a single line of code outside of college. Uh, and, uh, before we get into our topic today, which is the word data, um, I thought I would, uh, bring up and thank you for the nice, uh, little video you sent my mom.
My mom, over the weekend, turned 80, and I bring it up only because she's been a guest, of course, on the podcast. Um, and, uh, I think she's gonna, my parents are gonna move to Nashville. No freaking way. My mom is dead set on it. Yeah. And my dad was dead set [00:14:00] against it. Right. So I just spent four days with both of them.
And, um, my dad was, no way, no way, no way. I then took all of my nieces and nephews and my kids out, um, for my son, who turned twenty-one two days before that. You had a big week last week. I did. Um, and Armenian Christmas is January 6th. So we technically had three events, uh, all in the same timeframe. Hang on.
I don't understand. We'll come back to that though. Uh, well, anyways, so we go out, my wife stays home with my parents. By the time we get back home, my wife goes, they're moving and they're convinced and we're all set. And I'm, I walk in and my dad goes, We're moving to Tennessee, and I was like, oh my gosh, and so, then he asked many, many questions, all really good ones.
The fact is, uh, I am incredibly excited. I love hanging out with my parents, they're amazing, um, and now if they're in their 80s, I really want them close [00:15:00] to me. Just remember when they get down to always, always talk trash about the fact that they didn't come down when the kids were little. They waited for the kids to grow up, then they came down.
Well, you know, we, we do, I do have a sister, and she was up in Michigan, and that, my dad, I, it took, it took too long, but at least it might happen, and it might happen this year, and that would be amazing. Well, I'm excited. If they need help getting stuff in and out, let me know. Well, uh, you've got that big old truck.
I might need ya. I do. Yeah, for sure. I have, I have a truck, and I have, uh, two arms that can lift things. That's, well, But I don't pack. No, no, we got a lot of, we got a lot to do and, uh, it's, uh, it's, it's going to be good. Uh, uh, so I know that you said you wanted to get back to something. Congratulations. Uh, yes.
Armenian Christmas is January 6th. It's always been that way. What's the deal with that? Uh, that is, uh, probably closer to the actual birth than December 25th. And that's all I know. And that's all the, uh. [00:16:00] Well, some Armenians probably got a stone tablet with, with the birthday on there. It probably does, and it's been passed down from generations and all of that stuff.
It's always been that. We celebrate on December 25th. Mm hmm. And, and, and we're, uh, and, and that's what we normally do. So it's, it's, it's not as big a deal, but, you know. I'd never heard of that, though. That's cool. Yeah, and I, I think, uh, some will have January 7th. Others, there's other, uh, uh, I'm just thinking about how stubborn Armenians are to just flat out refuse.
Is this like the rest of the world does Christmas on the 25th? Nah, we're doing it That's the way it's always been done and nothing changes in our in the Armenian culture. Let me tell you that. All right. I, you know, we're here to talk about my favorite topic and, um, and, and I think it's obvious why we need to revisit this.
First of all, we should. Well, what do we, what do we revisit revisiting again? So we're looking at the new meaning of data or the meaning [00:17:00] of data with. I think we've all gotten a little confused, right? I mean, kind of. Um, we hear about, um, uh, large language models being trained on the internet. What does that even mean?
What is a parameter? What is, uh, when you hear about the, uh, 1.7 trillion parameters, what does that even mean? How does that equate to the human brain? Um, and then at the end of the day, we forget that if you don't have good data, you don't have good answers. Yeah. That's the, that's one of the things that I want to hammer home today is that we.
Our perception of data is definitely changing. Maybe the definition, not so much, but the way we, uh, the way we interact with it, like, has to be kind of forward thinking. Yeah. Because that's, that's the, that's just the world we're going to live in. It's the, the data we're creating now is going to lead to a more successful future.
Yeah. Or the, the cleaner the data that we're creating now leads to a more successful future. And I also [00:18:00] feel that my definition of data has, has changed. Expanded. Mm-hmm. Now that we have artificial intelligence and humans separated. Oh yeah, I'm interested in this. So you, you know, if you think about, and so what, do you wanna get into that now?
Or do you wanna get into No, I want to get, uh, not, no, let's not. Okay. Let, let's, let's get into that as we get going. So I think it's obvious if I was to say my definition of data, uh, and let's go back to it. We, we said it in episode one. Yeah. I'd rather not change it. Um, I think, I think it's, it still suffices from a simplicity perspective, right?
And it, and it, as simple as I'm looking at it so I don't screw it up, it is a capturing of an imperfect view of reality. And so I want you to think about reality. We are, we use data to tell us something about reality. That's, that's the point of, of, of data. Right? Um, and, and there's so many nuances to that as we think about that.
Now, we take data and we capture it, we process it. And [00:19:00] sometimes we store it. Well, I think, let's go back to what we, like the, the first, our first definition of data. It's an, it's a, um, the capturing of an imperfect world or an imperfect view of reality. Um, but one of the things that we talked about was like, are numbers data, are words data, and all this other type of stuff.
But I feel like what in, what is data? It could be more clearly defined because like one of the examples we talked about last time was the cave painting. So the cave painting is data. You look at it and you say, okay, well back in, you know, this time period, this animal lived or an animal approximately shaped like this.
And then, but then not only that, you're saying, okay, how does this, how is this pigment created? How, what kind of brushes did they have? There's lots of things that are kind of metadata. And then, but then there's the actual, uh, like data that it's supposed to be conveying, which is like, there's an animal here that we probably eat.
Yeah. [00:20:00] And remember, the data is different than the information. Remember we did that DIKW model? So we had, uh, data turns into information, which means it's formatted, turns into knowledge, which means we're making predictions, we're seeing a trend, and then if we do something with it, it turns into wisdom. I look at a cave painting now as probably information that represents data, right?
It's a, it's a visual depiction of the data from that time. So the cave painting itself? Mm-hmm. But it is data. It is, it comprises data. The, the meta part of it is data. Oh, absolutely. Yes. But the, like, it's kind of like the spreadsheet, right? Mm-hmm. Uh, the spreadsheet itself is information that, in, that comprises a set of data.
Yeah. So it's a data set. So a cave painting is a data set. Um, uh, a plot that is visually depicted through a painting. Um, yeah, I think that works today just like pictures do. Yeah. So, a photo, a photo [00:21:00] is data of a specific thing. So, if I take a picture of you, the data is that there, you know, here's what you look like, here's what you're wearing, you know, here's your scene and all this other type of stuff, but if you, if you swipe up or if you look at the, the information on that, then it tells you, well, he was at this location.
The picture was taken at this time. Like that spreadsheet stuff that you can put in. But it doesn't really encapsulate the context of the photo that you're looking at. Right. Now, let's take the image. What is the, because the image is data. It is literally data. Uh, and, and you could almost say the pixels within the image is also data.
But the, what you're seeing is, it is a proxy for reality. The reason you take a picture is to capture reality at that moment. Mm hmm. Um, how many times, my kids will do this, I think we did it over the holidays, where my, I'll say, uh, we'll play games at dinner time, and I'll say, what's the earliest memory that you can think of?
And almost every time, it is something where [00:22:00] there has been a picture of it. Oh, so from, okay, alright. You know what I mean? So they do remember it, like my daughter talked about, um, being in a dance recital. And she was wearing this little white with, uh, red polka dots, and my parents drove all the way down just to see her up there for two minutes, and she remembers being on that stage and all this other stuff.
But I think it's because of the picture. She remembers the picture, the picture triggers the experience. Um, so, you know, if you go back, like I can go back to, uh, certain times when I was really small and I remember it. But would I have remembered it if there wasn't a picture of me from that time? And I said, I can see that picture and then I see the experience, so that capturing of reality at that moment is a way to capture that experience.
It will kind of except for we're so flawed that like we will literally make up an experience around the picture, which goes to [00:23:00] hallucinations these days, right? So is your reality is your memory of the reality of that situation accurate? Nobody knows nobody knows that's true unless there was a video of it and now that's better data, right?
Exactly. Uh, and so now we're able to capture, uh, reality in a better way. And so what has really happened over time? It's not that we have more data. It's that we can capture data that allows us to process, record, and, at the end, make better decisions. Well, yeah, okay, so I feel like you hit right on it, but, uh, you're correct.
Kind of, at least from, 'cause the way I see it, and maybe I'm missing some of the nuance to what you're saying, is that, like, we most certainly are generating more data than we ever have before. We are capturing more, and yes, we are generating, we're doing more stuff. But the main difference now is that the, uh, the tools that we have access to, to touch as much data as possible to make [00:24:00] decisions,
are changing. Mm-hmm. So it used to be, if I was, like, we were talking about the Notebook LM thing earlier. Like, if I was writing an article on five different articles, I'd have to manually engage with each article, as opposed to asking all five articles a question at one time. And so it's the same thing.
Like, I can take, uh, like, say if I've been journaling for 20 years, or from this point I journal for 20 years, I'll be able to touch all 20 years of journaling at once. Whereas two years ago, it would still be a manual, like, I've got to go in and I've got to figure out a way to process this stuff, and sure, I may be able to do it programmatically, but there would be a lot of effort into doing that, and then the second something changes, it breaks.
Whereas right now, we have a malleable, uh, way to access huge amounts of data as an individual to make better decisions, which is the point you're making. Access data, process data, store data. Yeah. But does that, I'm going to go a little theoretical here for a second. Does that mean the data did not exist?
It's kind of like if the [00:25:00] tree doesn't make a sound, if you're not in the woods, does that mean it actually fell down, does it make a sound, and I know we can get into all that, but is, so for example, when I went for a run 30 years ago, and all I had was a Timex watch, the only thing I was capturing was time, I wasn't capturing my heart rate, I wasn't capturing steps, I wasn't, none of that, now I'm capturing all of that other data, I'm capturing that data, but does that mean I did not take steps?
I just wasn't capturing it at that moment. So we are better at capturing and processing data. Right. I think we're also creating more data because there's more things happening. Um, but, For sure, but it's not, it's not, it's not like, To the point you're making earlier though, It's not like game changing as far as like what we're capturing.
Right. It's just how, how we can capture it and then have access to it later. Yeah, and it really just comes down to, Is data something. And again, remember we always talk about it as a [00:26:00] singular thing because it's a concept We're not going to talk about it as a plural. Yeah, um, so i'm going to say data is yeah.
Yeah No, i'm with you on that one. Okay, uh, and uh, uh, but with that If we don't store it, is it still data if we don't process it or or is our definition something that it has to be captured? Um, because we do say it's a capturing of an imperfect view of reality. So um is that But instead, are we saying that's stored data, or is that just data?
Well, we talked about this a little bit in the, in the first episode, and like, I still kind of feel the same. Like, if, you know, this conversation, unrecorded, still is creating data and data points, uh, but it's just in here. It's being processed. True, but like, I don't think that your brain or anybody's brain can be trusted past a short period of time.
And so, you know, if I were to, if I were to go home and then write out, you know, do a [00:27:00] journal entry of my experience of this situation, I would trust that. But if in two years I were to do, I were to do the same thing just off of the top of my head, I would not trust that. And so there's a, there's a clear, um, degradation in quality if it's in your head, but it still counts.
Uh, you know, like how long, I don't know, and you know, that's individual dependent and to your point earlier, is there something to reinforce that memory, because apart from like photos helping, you know, stimulate memory, because it, memory is a path, and so your photo can help establish that, you know, maintain that path, and so maybe they're actually recalling it better, or they never let that instance kind of like disappear, um, uh, from the, you know, mental map, but then also like trauma can be, A big part of what, you know, causes people to remember things.
So, one of my first memories is falling, smacking my head on a doorknob like right above my eye and seeing the big flash and all of this other type of stuff, but there's no photo of it, but I've still got that [00:28:00] memory. Having said that though, I don't actually know that that, I know that I hit my head, but the rest of the things around it, I don't know that that existed.
And so, is that data? Yeah. And I think what you're getting into is the interpretation or the recall of it. Yeah, you're talking about recalling that. If I were gonna bring that into the world as far as real data goes, I'd have to, you know, say it right now or I'd have to write it down or something to bring it into a digital format.
Yeah, it doesn't have to be digital, I guess, but a format that would be considered data. But if I know that I'm likely to be wrong, does it count? Does that become biased? Uh, and you know, so that's why we have biased data, because technically I don't think data is biased. It's our interpretation of data, which means the information that we glean [00:29:00] from the data.
So what you're doing is, uh, think about it more. I think we're getting more into the information, the presentation of data as opposed to the actual data point. Like, to me, data is a raw thing, it is, it is, and, and it's what we use to be able to remember. So you're saying, you're saying that, like, if I were to write it down, it would still be data, but it just could be biased.
Sure. And so, so, would you think, would you say that it's a response? That's an interpretation of the data. Right, right. For sure. But do you think it's the, it's the responsibility of the individual to, um, uh, to vet that? Like, to be able to vet and understand potentially biased data, I guess. Uh, to the best that we can.
And I think that has to continue to be a topic, now that artificial intelligence is like another being that has the ability to bias. Um, but again, it goes to the interpretation part of it. Uh, as an example. If I was to ask [00:30:00] you, um, what was the high temperature from four weeks ago on a Friday? Yeah. You can, you might be able to remember, oh, wait a minute, that was a day I was outside.
It was incredibly hot. I bet you it was like 97 that day. Yeah. I can tie it to an event, a memory, something. Or I could get on an app and I could literally go back to that and get the exact number. So which one do I trust more in that case? So we have a better way of storing it. If you go back a hundred years ago, we would.
You know, if we didn't have a way to record it, now we did, probably somebody wrote it down and things, but, um, so our interpretation or information, which means how we take data and show it as a representation, is incredibly important. It's actually the most important part of what we do. But that can also be extremely biased or skewed in one way, shape, or form.
And I think information is biased. Information can be skewed. Um, the data itself is needed to [00:31:00] bias something, but it's not the trigger for that. Um, for example, um, uh, go back to the, the, the, uh, traditional case of somebody getting a loan, and if you are male or female, the data is your male or female. But then people use that data and, and make an interpretation that you will, You can either pay it back or not pay it back based on your gender, which is wrong, right?
That's a, that's a biased opinion, but the data itself is just a yes or a no. Yeah, but what if the data says that, I mean, like you're saying that, but I think that we need to clear it up some more. Okay. Because it feels like we've stumbled into like a, uh, like a quagmire of, of like an existential, metaphysical, like, uh, data interpretations and stuff like that. But the, so go back to the, to the analogy you were just making. [00:32:00]
Let's go through DIKW to do that. Okay. So, um, go ahead. So using that model, um, think of the data points that you have of somebody. If I'm the loan officer and I have to make a decision, I, uh, what data do I have? Uh, income, uh, Well, I mean, the important ones are, uh, credit score, debt to income, uh, payment history.
Credit score is an interpretation of my debt, based on my debt, my income, things like that. And it, you would figure it would be in a set algorithm too, but it's not. No. 'Cause the, each, each of the, the, uh, the credit unions do it a little different way. Yeah. Um, so, um, and then you look at demographics, you look at income, you look at, um, uh, their job, how long they've been there, their debt, how many, how many credit cards they have. Um, all of that, that data, it's then presented to [00:33:00] you in one sheet or in one profile.
That's the information. Okay. And now you need to make a prediction. Will they pay it back or not? Now we get into that knowledge component, right? Yeah. So, so you see how we went from the raw data, to a presentation of the data, to decisions that we're about ready to make based on trends. So I'm going to look at it and say, you know, the debt to income ratio.
Yeah. Um, okay. They got too much debt compared to their income. Well, let me take a step back just to fill in some of the picture. So I've been a loan officer before, so this actually helps out. The very first thing you do once you've, uh, got somebody on the line is you get all of their personal information and then you go out and you pull everything you possibly can.
Sure. You ask for W 2s, you ask for, um, like, uh, uh, last three paychecks, you go find their deed, you make sure that they don't have extra liens on their house, uh, like there's all kinds of different stuff that you have to, you have to collect the data before [00:34:00] you can even take a step in the direction of processing a loan.
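As a rough illustration of the data, information, knowledge steps in the loan example above, here is a minimal Python sketch. The `LoanApplicant` class, its field names, the sample numbers, and the 0.43 debt-to-income cutoff are all assumptions made up for this example; none of them come from the conversation or from a real lending rule.

```python
from dataclasses import dataclass


@dataclass
class LoanApplicant:
    # Raw data points (the "D" in DIKW). All fields are hypothetical.
    monthly_income: float
    monthly_debt_payments: float
    credit_score: int
    years_at_job: float


def build_profile(applicant: LoanApplicant) -> dict:
    """Information: the raw data arranged into one summary sheet."""
    dti = applicant.monthly_debt_payments / applicant.monthly_income
    return {
        "debt_to_income": round(dti, 2),
        "credit_score": applicant.credit_score,
        "years_at_job": applicant.years_at_job,
    }


def too_much_debt(profile: dict, max_dti: float = 0.43) -> bool:
    """Knowledge: a rule of thumb applied to the profile.

    The 0.43 cutoff is an illustrative assumption, not a standard
    taken from the conversation.
    """
    return profile["debt_to_income"] > max_dti


# Made-up applicant, purely for illustration.
applicant = LoanApplicant(5000.0, 2500.0, 700, 3.0)
profile = build_profile(applicant)
print(profile)
print("Too much debt compared to income:", too_much_debt(profile))
```

Note that gender or the loan officer's impression of the person never enters the sketch; any bias would have to come in at the interpretation step, which is the point being made in the discussion.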
Okay. Go ahead. Uh, and so you do all of that. Do you ever get biased based on your interaction with them? With the individual? Yeah. Um, there's an inherent bias to selling loans anyways, and that's that you make money off of selling loans. And if you don't sell a loan, you don't make money. But if you met me versus somebody else.
And we had the exact same profile. Is there a possibility based on our personalities? Oh, yeah. Okay. See, that's new data. Yeah, I would just not give you anything. Well, because I'm not, uh, trustworthy? It's not trustworthy. It's just I don't want to talk to you. Oh, I see. Just dealing with you is, it's, it's a lot.
It is a lot. He's not messing folks. He's, there's, uh, there's some data that's in there. There's some metadata that I've got to pull out of that. Uh, but, uh, you know, I think that's where I want to go with our, our next part. And that is what is data [00:35:00] as like actual data points and basically like the categories.
'Cause we know numbers are, right? Yeah. We know that text definitely is now. We may not have been believers in that, but now with artificial intelligence, yeah, we're, we're okay with text now. Well, with, uh, the pre-trained transformers, they learned how to turn words into numbers anyways,
behind the scenes, and it vectorizes the whole thing and all of that. Uh, we use audio, we use images now as, uh, stuff. But what else, uh, that a human brain looks at in terms, 'cause I, I like to make this distinction between humans and AI now. Yeah. Because we're, we're afraid that AI is going to make all the decisions in the future, right?
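To make the "words into numbers" remark concrete, here is a toy Python sketch that maps each word to an integer ID. It is a deliberate simplification and an assumption on the editor's part: real pre-trained transformers use learned subword tokenizers and then map each ID to a learned vector, neither of which is shown here. The function names and the sample sentences are hypothetical.

```python
def build_vocab(sentences):
    """Assign each distinct lowercase word an integer ID, in order of first appearance."""
    vocab = {}
    for sentence in sentences:
        for word in sentence.lower().split():
            if word not in vocab:
                vocab[word] = len(vocab)
    return vocab


def encode(sentence, vocab):
    """Turn a sentence into a list of integer IDs; unknown words become -1."""
    return [vocab.get(word, -1) for word in sentence.lower().split()]


# Made-up corpus, purely for illustration.
corpus = ["data is a raw thing", "information is a presentation of data"]
vocab = build_vocab(corpus)
print(encode("data is information", vocab))  # prints [0, 1, 5]
```

Once text is a sequence of numbers like this, it can be stored and processed the same way as any other numeric data, which is the sense in which text "is data now."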
But what is it right now that the human brain can do, or the human can do, Yeah. That is gathering data that is hard for an [00:36:00] artificial intelligence to do? Um, we can apply, we by default apply data to experience. And so when we see a new point of data, like, it's, it's colored under our own experience. And so I think that that's one of the things that we're, we're more able to feed into AI now that we weren't able to do a year ago.
So I can, I can go in and I can, uh, you know, 'cause I've, I've done this for years. Like if I have a good experience or a good conversation or something like that, that I feel like is relevant, I'll write it down and keep up with it and whatnot. And it's, it's tied with emotion, or it's, it's, it's, uh, colored with emotion, like this made me feel this, or this is what happened, and all this other type of stuff. And so there's a, there's a lot of just, non, I don't know what to call it, it's non-direct experiential, uh, data that takes, like, a data point that says one, and then you change it to 1.5
because of [00:37:00] your own experience. And that's, that's not anything specific, but that's how we, uh, adapt to incoming data. And so the, the Gen AIs of the world now are getting to where they can help us do that as well. So you can go in and you can say, this is me. And here's the new information. What do you think about it?
And then it'll give you an opinion. And then you can also have your own opinion. And then you can, you can compare these two abstract opinions in ways that you couldn't before. And that's, that's kind of what I was talking about with that Notebook LM, is like it's taking data that I read and I have an opinion on, and then it's giving me its opinion on it based off of what my opinions were, or like what my input was, because my input is one of those data sources.
And so I'm tying my experience and my knowledge to an existing set of knowledge and then blending those things together in a way that creates something new. Yeah, so going back to our definition. Capturing of an imperfect view of reality. We are [00:38:00] trying our best to capture whatever it is. Like right now, you and I sitting here, what data is being captured?
From an artificial intelligence standpoint, we've got the words, we've got the video, we have images, um, we have the metadata behind all of that, you know, time of day, all of this other stuff. And then there's stuff that's hard to capture still for artificial intelligence. If I'm sitting here and you're talking and I look away, and I'm just, looks like I'm just staring off into space, you will interpret that in a certain way, right?
Um, I'm not saying that over time, artificial intelligence from video will not be able to capture that, but is it doing that now? Like, does it get the cues? Well, and I'm saying, it's getting better, but what is it that we're using, you and I are using, to make decisions? If I was to lean back, um, all of the body language, our, our experience in general, is it cold,
is it hot in here, are all things that go into [00:39:00] us making decisions. Yeah, it just does. And at the end of the day, a human being is trying to make a decision or pass it off onto artificial intelligence to make the decision for them. Or, how many times do you have a hard time making a decision, so you let somebody else make that decision for you?
I mean, that's, that's what I strive for in life. Is to not make a decision? Yeah, as many, as few decisions as possible. Well, there's some things I, my wife will just make and other ones that I have to make. And there's times I wish she was making that instead of me. And so, um, it's all based on decisions and we as humans still use more data points
than artificial intelligence. Oh, uh, okay. So I'm gonna push back on that just a little bit, but are, are they the correct data points? Uh, it is the, uh, no idea, because last time I checked, we still make incorrect choices as people and, uh, um, and [00:40:00] not only is data for artificial intelligence imperfect, but it is for us too.
And you know, I've, I've gotten to the point where lately everybody asks me a question and I just say it's a data problem. Because what's not a data problem? Um, If I have better data, I will make the right decision. If I don't make the right decision, I didn't have good data. Well, yeah, because the way you're tying it in the DIKW model to action and whatnot, I think that's, that's accurate.
If, you know, and again, not every decision in the world goes through this incredible four step process. What did I have for breakfast? Um, I just got home after a couple of days and, you know, we gotta go grocery shopping and, and that is what it is. I didn't have to go through this. Well, the best way for me to be optimal today is to do, you know, I didn't do any of that.
We try to do that inherently or just, uh, passively though. I think so. Uh, if I'm making a left or if I'm going somewhere and I think the place is left, [00:41:00] but I know I can still get there right, but it takes 20, 20 more minutes. I'm, I'm going to take the left. Yeah. And, and that's, and that's not because. You know, I had to ponder over why I would do all this other kind of stuff.
I just naturally go to the, to the, to the best choice. And, but machines don't know by default what the best choice are, or choices are. But, we all, like, I guess the point I'm trying to make is we always don't either. Because sometimes I think it's left, and it was actually right in the end. And I would have made a better choice by going right, even though, you know, I was just wrong.
That's the, the nature of being wrong. Well, and today, what works for you is different than me. So you, you need a personalized decision versus what I would need. So you would say, oh, I can't, uh, I can't look at screens past eight o'clock because I won't be able to sleep. And I'll be like, that doesn't bother me at all.
And, you know, we're wired differently. And yes, artificial intelligence, when we get to a point where we have, um, digital twins, then we could [00:42:00] do all this other really cool stuff. Um, I don't think we're there yet. Um, and so I still feel like our experiences over time, like how many decisions do I make because of how I was brought up?
I think a lot. Yeah. Um, and, uh, will that be something that's captured? Should that be part of my decision? You know, think about that. Or if I'm making a biased decision because of where I'm from. Um, you know, and so we base decisions on more than just numbers and text. Yeah. Um, you know, we go through, um, I think the human brain, um, still processes things at a, uh, that we don't quantify yet.
Yeah. Well, I mean, I'm a parent, and I understand that I, I came to understand really quick rather that I held a lot of unconscious bias that I was not aware of. Like I was making decisions on how I raised my kids [00:43:00] that I'd never thought about before, but then when I stopped and looked at it, I was just like, Oh, I just skewed that way by default because that's what, that was my model or whatnot.
Your experiences dictate it. So experiences I think are still hard to capture. So data is still the same, how it's captured, how we process it. I think it's, it's not the experience necessarily. I think it's the impact of the experience. That's maybe more important, because like you said, like, if I write a set of sentences down, to me, they may be awful, and to you, it may be like, I don't care, it's whatever.
It doesn't affect me at all, but it could be the worst thing for me. Yeah, um, so let's do some comparing, because I get this question a lot between the human and artificial intelligence. And so I'm, I'm looking here, and, um, I went through and I tried to see if we could equate parameters to the human brain. Okay. Um, and so when we say parameters, ChatGPT is sitting at, what, [00:44:00] 1.5 to 1.7 trillion parameters.
And what a parameter basically means is anything that is an interaction between something. So the word "the" in, uh, ChatGPT has several thousands, millions of interactions. Yeah, words it's connected to. Yeah, words it's connected to. So every one of those is a parameter. So that's where you get up to the 1.7 trillion.
That's why it's such a large number. So you then have to equate that to neurons in the brain. And, um, it's hard to kind of figure that stuff out. So again, sitting at about 1.7 trillion for, uh, ChatGPT, about the same for Gemini now. Gemini supposedly has more, uh, parameters. The neurons in your brain, you can equate it.
And this is me reading eight different articles, all of them way all over the place, is between eight trillion and 30. Okay. [00:45:00] So for, at 1.7, it's at least eight. And so I'm just going to say right now, the human brain's parameters, completely making this up, yeah, don't quote me on this, is 10 times more than where ChatGPT is today.
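For anyone who wants to check the back-of-the-envelope comparison above, here is a small Python sketch of the arithmetic. It only uses the figures quoted in the conversation (which the speaker himself flags as rough), and it assumes that "30" means 30 trillion; the variable names are the editor's.

```python
# Figures as quoted in the conversation; the speaker calls them rough estimates.
chatgpt_params = 1.7e12   # upper figure quoted for ChatGPT
brain_low = 8e12          # low end of the quoted range for the brain
brain_high = 30e12        # high end, assuming "30" means 30 trillion

low_ratio = brain_low / chatgpt_params
high_ratio = brain_high / chatgpt_params
print(f"{low_ratio:.1f}x to {high_ratio:.1f}x")  # prints 4.7x to 17.6x
```

On those numbers the brain comes out at roughly 4.7 to 17.6 times the quoted model size, so the "10 times" guess sits inside the range the speaker describes.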
Yeah. That that's just from a math perspective. Because like you said, like there's a lot of squishiness to what is consciousness and all this other type of stuff that we don't really, it could have an exponential impact on like how our brains compute. Yeah, and I'm just trying to go to how do we process data?
Yeah. And so how many parameters, how many interactions within our brain allow us to do that? Now, there's two things that go into a decision with artificial intelligence. It's going to be the number of parameters and the data. And the large data, you have to have a large data set and you have to have a lot of parameters.
If you have less of one, you have less of, uh, uh, you have a less accurate model. You have diminished decisions. There you go. Uh, diminished [00:46:00] impact. Diminished number of decisions. Yeah. So, um, so same thing for you, right? How much can you capture and remember that data that you have? Now you don't remember every single, um,
going back to, uh, you know, uh, zero BC and everything that's happened, but technically ChatGPT could, right? If it's all been chronicled and stuff. So you have different data, but I have experiences, things that are very centric to me. So it's almost like we have these specific large language models, and then we have at least 10 times the processing power today,
um, of a ChatGPT or, uh, Gemini. Um, and so we're, we're different. Now, which one's better? I still feel that human is. Um, we are biased, we are biased, and, um, and the decisions for me will be better [00:47:00] from this human. Uh, then again, some of the decisions I've made lately, maybe I'm wrong, but you know, um. Well, that's just intelligence.
Like when you're, when you're thinking about intelligence, because, you know, I've thought about this before, it's, it's like, what is intelligence, and it's, it's, uh, to me it's as simple as intelligence is knowledge that leads to a number of, uh, options that equals more than average. So if I, if I've got a problem, if I can think of seven, uh, resolutions, then I'm better than somebody who can think of one.
Now, is the end result going to be better? Maybe not, because maybe their one was the same as the best one that I came up with. Yeah. But then, uh, in more often, in more cases than not, it's going to be I'm going to have the best option. I'm going to have the most, um, I'm going to have the best option consistently because of the amount of knowledge that I have that leads to more decisions.
And so, to the point that you're making, and, and the point that, [00:48:00] uh, I was getting to earlier with bringing experience into the, uh, the, um, the codex, if you will, or the corpus, if you will, then you're, which means the data set itself. Yeah. The actual AI data set, then you're bringing like you're, you're not just bringing data as in like, yes, no, but you're bringing like potential mentorship into it.
So here's my situation. How could I handle it? Then AI has given you the seven different options that you may have only been able to think of one. And so from that point, like, what is intelligence? What is, like, what is data in that specific moment? Because you're melding, like I was saying earlier, you're melding your own personal data and experience with that of the internet.
Yeah. Or we'll just call it the internet for now. And so you're, you're being more, or you're becoming more. You know, you've definitely got more options than you had before. Yep. [00:49:00] Uh, just by interfacing with this, with this new data, if you will. And I think that's something, you know, I, I, I, I put into three different buckets.
You could either have the absolute truth, mm-hmm, which most of us don't know if we do or not. Sure. Um, now, uh, four minus two, we know the absolute truth of that, right? In base 10 math we do. Yes. Uh, um, or we have a ton of different versions, like you just said, seven different, uh, scenarios or seven different versions of the truth, and we pick the right one. Or, which a lot of people have, is just a biased one, right?
And so we all know that we'd rather not have a biased one, but we do, a lot of times, make decisions on biased data, or biased interpretations of data, instead of the seven different things. Let's go with my easiest example. Um, and, and because I was just out of town, I was, um, up in Detroit, and I wanted to go to the best Coney Island.
If you go to Detroit, you go to restaurants [00:50:00] that are called Coney Islands. Okay. Um, so what does that mean? It's usually Greek style food. Okay. Um, but it'll be more like fast food style. So it'll be incredible breakfast, uh, omelets, uh, hash browns that are way too greasy. Okay. It's kind of like a, a Greek diner.
Yeah, yeah, yeah. Um, And then for lunch, you would get, uh, gyros, you know, gyros, uh, you know, um, uh, Greek salads. Um, and then a lot of times there'll be like moussaka, which is, uh, um, it's almost like a lasagna in Greek food. Now I'm hungry. Uh, now we'll have to get, get some, uh, we'll go, go, we can't find a Coney Island around here, but that's what they're called.
So there's Leo's, there's El George's, there's Kirby's, there's National, um, there's all these different Coney Islands. Well, which one am I going to go to? How would I get the data and where do I get that data? And so if I ask somebody that is loyal to one, hasn't gone to anybody but Leo's for the last 20 years, what are they going to tell me?
I mean, they're going to tell you Leo's. They're going to go to Leo's. That's biased at [00:51:00] that point. Extremely. Yeah. But if I instead go to a place and I get seven different people and they say, oh, go there because, man, their salads are enormous, they have the best this, I love their dressing. Another one says, oh, but if you want to get their spinach pie, it's amazing, you know. And then, so you get all these different reviews. Now I can, or I go to Yelp or something like that. Now I'm making a decision based on multiple ones. And for some people, they just want the biased one and go on, right?
It's like, just tell me Leo's. You've been going there for 20 years. That's good enough for me. Yeah, like if I, if I were to go to Detroit and you were saying, if you don't, if you told me that I had to go to Leo's, I'd be like, well, I gotta go try it out because he likes it a lot. And, uh, but I would tell you the fact is, El George's is the best.
Okay, well, obviously I would go to El George's then. But, uh, but of course that's really a biased decision, right? I don't care. But that's, do you care? Sometimes you may not care, but then sometimes you do. In this case, I would want to try all of them over time if I [00:52:00] could, but if I'm in Detroit once and I had to go to one, where would you go?
And, um, now I'm picking El George's. Why? It's on Michigan Avenue, five minutes from my parents' house. It's where we go. I love their salad and my kids love it. And when you go in there, it looks like it's from the 1970s. It's awesome. You know, it's just that throwback, uh, diner style. Uh, Leo's is great, but there's like 50 of them, you know, so it's more of a chain.
That's how hot chicken is in Nashville. That's right, yeah, so the Hattie B's versus the Prince's is a big, that's, that's a good, that's a good example too. And so, uh, same, same concept, always good. Leo's is always good. Yeah. But El George's is special to me. Yeah, that's, uh. Where we've gone for birthdays and all of, you know.
It's, it's in your, it's in your zeitgeist. It is. So, so I, I do think in terms of data, we make decisions based on those three. And a lot of times biased is okay. Yeah, that's fine. We have our good friend, our data storytelling [00:53:00] friend, Zach Gemignani. If he tells me an Italian restaurant is good, I'm going to believe him because he's Italian.
That's it. That's the only reason. That's the only data point I need. Uh, he also says he's an, uh, Italian food snob and I believe him. Um, so I'll believe him and that's, uh, that's a biased decision, but that's okay. I didn't need to go and get everything. Yeah. It was really easy for you. That's right.
And so for personal stuff, I think, um, that's one thing about data. The other one that I think is really important to always talk about is when we are making decisions, if we, it's new to us or we want to get better at it, it's good to keep and store the data. Yeah. So, like, when I first started getting into health and nutrition, I would, I would journal my, my food, how much I ate.
These days, I feel like I don't need to because I've created that computer. But it took me a long time to [00:54:00] get to that point. Yeah. To where I could tell you, I pretty much need a certain number of calories in a day, and I know based on how I eat if I'm getting those or not. Where a lot of people, that's hard to
do, but because I quantified it. You've effectively trained yourself though. I trained myself through the collection of data. Yeah, but that's, uh, that's, it's important to call out because you didn't just, you didn't just, um, magically come to know this after doing it for a little bit. Like you meticulously kept up with your food for years.
You said it was a decade, right? Easy, yeah. For over a decade, where you wrote numbers down, you, like you, like you said this before on the podcast, you can tell me how many grams of protein is in an eight ounce piece of chicken. Oh, definitely. 110. But once, and once you've got, once you've got all that information, like that's, that suffices for your particular needs, but it doesn't necessarily mean that, uh, you know, what you were doing before is irrelevant, or starting it again would be bad in any kind of way, shape, or form, because I think [00:55:00] that, Something that I've been focusing on personally is like, how can I generate as much data as possible for myself?
And so you mentioned earlier, journaling is the biggest thing. It's the biggest, or it's the most accessible way to create your own personal data. And now, with large language models, you can just spam out whatever you think raw, and leave it as is, and then put it into some sort of AI. And then you can ask yourself questions about yourself.
Yeah. You can do all kinds of crazy stuff that we couldn't do just two years ago. Mm-hmm. And so I, I'm looking at, I've been doing this for a while because I've always, uh, I say I've always, I've, I've been under the belief that at some point all of our data will be exposed. Uh, like if I, if I do it or if I capture it or whatever, at some point it's gonna be a risk.
And so I, I like to live in a way that, um. [00:56:00] Like it's, I'm one to one, like I match, like if I say it, I do it. And if I don't talk about it, then I probably don't do it. But then, um, with the, uh, with the journaling and whatnot, it's, it's like now I can really just start creating as much as I want because I don't have to worry about getting it and building it up in any kind of way.
I can just interact with it. And so I think going forward, a data centric mindset. Not just like what can I create, but how does that, you know, how does that interact with my life? Like what is my high value data? Sure. And what, you know, what do I care about? And then once I know all of that stuff, then I can, you know, I can either decide to, um, you know, like dig deeper or just capture what I've got and that's enough.
Or like, when do I go in and check on myself? Or like, there's a whole bunch of, a whole bunch of new questions. That we've never really had to ask ourselves before because it wasn't [00:57:00] possible, whereas now it is.
So I'd kind of like to go into all the business aspects as well, but I think that would be another hour. And so maybe we should hold off on that, uh, for another episode. Uh, but Charlie, just give me some of your thoughts on, uh, what we've talked about today and some final, uh, tidbits on data. Yeah. So I think the important piece that we've,
that we've talked about today is that, like, what, what is data hasn't necessarily changed, but what is data that we have access to work with most certainly has. And so the, um, uh, the analogy about, uh, that we were talking about earlier, that we have more information to turn into knowledge now, uh, than we've ever had before, kind of sticks out to me, in that,
artificial intelligence, I never really thought about how artificial intelligence could be tied to human intelligence. I've always thought about it as apart from, but I think that those two things are not necessarily too different now. Well, that's a good way of thinking of it. And so we can use our human intelligence and supplement it with artificial intelligence to become more intelligent individuals.
And that's one of the things that I think is, uh, is the coolest about this is that we, we're all smarter for having access to these things. And so now, the, the most important thing is getting the stuff that's in here, out here. And so like, I, I, I feel fancy for having journaled for as long as I have. And so I'm glad that I've done that, and it, because I'm gonna, I, I have the ability now to take that information and ask questions of myself.
And so, since I started, I started journaling in 2018 in earnest. And so, from 2018 to now, I can ask, I can ask, like, what did I think about this? Or what did I think about that? Or I'm thinking about doing this. What, you know, what experience do I have? Or whatever. You know, because this is only six years ago, but what if it was 20 years ago?
In 20 years, will I, you know, will what I wrote down in 2018 matter? I don't know. But the, the important thing is that I'll have it. And if it is important, then it'll be available to me. Yeah. And, and data has to be stored and has to be processed. Mm-hmm. And so you have been storing data, um, from journaling.
And journaling is like, to me, it's like this, it's a different type of data. Mm-hmm. It's, it's a reflection of what's happened. It's almost like your metadata, yeah, is a, is a good way of thinking of it, as opposed to, you know, how much did I weigh on that day? That's a fact. Mm-hmm. Um, and that's a hard one, but.
You're adding the metadata from your daily, uh, tasks. Um, I, I really think that's interesting, but now, in the past, the only way for you to process that was to go back and read it. Yeah. And read it over, and nobody ever does that, right? Yeah, so like, uh, a good example is, um, you know, if I, if I journal consistently, and I journal about a family member consistently, I, you know, I talk about them or, or whatnot.
I can ask, like, how has my opinion of this person changed over the years? Do I seem generally positive or negative towards this person? Like, have they, do they affect me in any kind of negative way? And I could potentially pull insights that I couldn't have otherwise out of just me writing things down that has one meaning, but when you tie it to five other data points in the same avenue, it has a totally separate meaning.
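The kind of question described above ("how has my opinion of this person changed over the years?") can be sketched very crudely in Python. This is only an illustration under loud assumptions: the journal entries, the name "Sam", and the tiny positive and negative word lists are all made up, and counting words is a stand-in for what a large language model would do far better.

```python
from collections import defaultdict
from datetime import date

# Hypothetical word lists; a real tool would use a language model instead.
POSITIVE = {"glad", "great", "helpful", "fun"}
NEGATIVE = {"angry", "tired", "annoyed", "upset"}


def yearly_tone(entries, name):
    """Return {year: positive_count - negative_count} for entries mentioning `name`."""
    scores = defaultdict(int)
    for day, text in entries:
        words = [w.strip(".,!?").lower() for w in text.split()]
        if name.lower() not in words:
            continue  # entry is not about this person
        scores[day.year] += sum(w in POSITIVE for w in words)
        scores[day.year] -= sum(w in NEGATIVE for w in words)
    return dict(scores)


# Made-up journal entries, purely for illustration.
journal = [
    (date(2018, 3, 1), "Lunch with Sam. I was annoyed and tired afterwards."),
    (date(2018, 9, 12), "Sam called again, still upset about it."),
    (date(2023, 5, 4), "Great day hiking with Sam, really fun."),
    (date(2023, 6, 2), "Worked late, tired."),
]
print(yearly_tone(journal, "Sam"))  # prints {2018: -3, 2023: 2}
```

No single entry says much on its own, but grouping them by year surfaces a shift in tone, which is the "one meaning alone, a different meaning together" idea from the conversation.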
Yeah, were you, uh, kind of depressed in one year versus another, just based on your writings? Yeah, and like, what commonalities are in the depressed years? And then what, how does that differ from the non-depressed years? You're, you're, you're talking about certain habits that you have in one year compared to another, and all of that metadata was something you could store in the past.
Now, with all of these new tools, we have ways to process it. Yeah. And so I think that's where I go with things is I always think of the fact that we can, uh, we've been able to store things in an analog way in the past. Then we were able to store things in a digital way. And now we're doing a better job of storing things in a, uh, way that can be processed.
Um, not just facts and figures, but also the metadata that goes along with that. And, and so, um, I think that's, um, incredibly important. So, if you know the data, you're able to have better decisions. If you can have ways and let artificial intelligence be ways to process that data, it hopefully will lead to better decisions.
And that's the whole point of why we wanted to talk about data again today. Yeah, yeah. So, for everybody who's listening, the, the data that you're creating passively right now is great and all, but, uh, you do need to get that experience out. Yeah. While it's still relevant and valid and, and not a complete and total hallucination.
And on that, I think that's a good way to end today's program. So, again, Charlie, it's been fun talking data. I think we'll talk about data again a couple of times. Um, and so I just want to again thank everybody for your continued or for your first time of listening to the Data 4 All podcast. If you like what you heard, please go ahead and subscribe to the podcast on your favorite podcast player.
Or the best way to get to know us is to go to our website, which is the data4all.io website, where you can learn about us, download any of our podcasts, or again, subscribe on your favorite podcast player. Mm-hmm. Or, everything we do in an audio version is also video. Yep. And, uh, you can see all of our YouTube, uh, uh, episodes and clips and things that we've done, uh, from our YouTube channel on there.
And, uh, again, that's data, the number four, all.io. And again, thanks again for listening to another amazing episode of the Data 4 All podcast. I'm Charlie Apigian. And again, I'm Charlie Yielding. And again, until next time.