How can we use computers to help us understand the world? video transcript

[Music] 

[Lewis Allsopp] Hello and welcome. This is ‘For a Better Tomorrow’, the University of Derby's Innovation and Research Podcast with me, Lewis Allsopp. Each episode, I chat to someone in Academia about what they're working on and how it'll make our lives better.

Now, previously, we've spoken to researchers answering questions like ‘why singing is so good for us?’ and ‘why inter-business relationships are so important to our daily lives?’, but today I want to talk data. Now, I know it sounds big and difficult, but actually all data is, at the end of the day, numbers and words. But how do we get these numbers and words and make them readable, you know, so that you and I can understand them? Well, to help me find out, I'm joined by Professor Farid Meziane, who's the Head of the Data Science theme at the University of Derby, as well as being a Professor of Data Science here.

[Lewis Allsopp] It's lovely to speak to you. I mean, before we get into that, just tell me a bit about you first of all.

[Farid Meziane] It's for me just studying all the time basically. So, you finish your first degree and then you get a scholarship, you come to the UK, and you study for a PhD in Computer Science. I did my PhD at the University of Salford, which I completed in 1994. And then, a year after completing my research, I went to Malaysia, where I worked for three years. And then around 1998, I came back to the University of Salford as a full lecturer, where I spent, I think, most of my time until I came to Derby in 2020.

[Lewis Allsopp] And you're the Theme Lead for Data Science at the University. What is Data Science, first of all?

[Farid Meziane] I think I've been asked this question many, many times. Some people will get it straightforward and some others you need some time to explain it. So, I would start with a simple example: let's say that you leave this room, and you walk in the corridor, and you find a piece of paper, and on that piece of paper you have a list of digits (so a sequence of digits, numbers). So, you pick it up, you look at it. It doesn't make any sense to you, does it? So that's data. And then you look at it further and you realize that it's 11 digits and it starts with 07, so you start giving it some structure. You think it's a phone number, at the end of the day, that you've got here, so the data became information.

And then imagine you are a bit curious, and you take your mobile phone, and you ring that number and on the other side somebody says, let's say, “Hi I'm Michael, can I help you?” or something like that. So, suddenly that information becomes knowledge, and therefore, you can do a lot of things with knowledge. So, you associate a person with that phone number. So we move from data, where you have something but you don't make sense of it, and then if you try to organize it, it will become information, and if you try to extract some knowledge from it, it can become knowledge, which is very powerful.

So, data science is looking at all the data that we are producing, whether it's through the newspapers or through, let's say, the weather agencies or the financial organisations and so on. And then, you have so much data that you'd say, ‘what can I learn from this data?’. So, you try then to write algorithms, and what those algorithms do is look at the hundreds of thousands, if not millions, of data items and try to find out whether there are relationships between the data. Probably the most common use of data science is trying to predict, so you try to learn from the data that you have collected for years and years in the past, and then you try to predict what would happen in the future.
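The phone-number example above can be sketched in a few lines of Python. This is only an illustration of the data, information, knowledge progression described in the conversation: the `classify` function and the `CONTACTS` lookup table (standing in for ringing the number) are hypothetical names invented here, and the number itself is made up.

```python
import re

# Hypothetical lookup table standing in for "ringing the number".
# The number and name are invented for illustration.
CONTACTS = {"07123456789": "Michael"}


def classify(raw: str) -> dict:
    """Move a scribbled digit string from data to information to knowledge."""
    # Data: strip everything that is not a digit; no meaning attached yet.
    digits = re.sub(r"\D", "", raw)
    result = {"data": digits, "information": None, "knowledge": None}

    # Information: 11 digits starting with 07 looks like a phone number.
    if re.fullmatch(r"07\d{9}", digits):
        result["information"] = "phone number"
        # Knowledge: associate a person with that number, if we know one.
        result["knowledge"] = CONTACTS.get(digits)
    return result
```

Calling `classify("07123 456789")` under these assumptions gives all three levels, while a string such as `"12345"` stays at the data level because no structure can be recognised in it.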

[Lewis Allsopp] It's really interesting. It's almost like we're taking these almost like robotic, mechanical digits and numbers and we're making them human. I suppose that's the process isn't it; you know getting all of these numbers and making them palatable so that we can understand something from them. And you say predictions, I suppose like you say the weather is one of these things we're constantly looking at numbers and slight changes in different parameters, which we can then use to predict what's going to happen next, by what's happened before.  What are you looking into personally or what have you looked into? Talk to me about something that you perhaps are proud of I suppose.

[Farid Meziane] I think my expertise personally is looking at textual data, so rather than looking at numbers, I look at texts and try to understand what is in the text. So, this is sometimes referred to as Natural Language Processing (NLP), sometimes it's referred to as text mining, so you mine the text, you look into the text, and you try to extract knowledge. So, on this side, I would say that some of the work that I have done was on trying to understand the reports that are written by our Radiologists and then try to structure them, try to extract the information and put them in such a manner. We noticed about, I would say, five or six years ago that there is a lot of information that is in the Radiologists' reports, but the way they are written makes it extremely difficult to organize the data or extract the data - they use different terms, they use abbreviations, they use some vocabulary that is not standard, there are many typing errors and so on - and that is because this is a task that they do while they are looking at the physical process, I would say. So, they just scribble things, but that report is very important. That's the report that will go to your GP or another specialist in the field, and it's based on that report that some diagnoses are going to be made and also the treatment for that particular disease.

So, what we did was we wrote a program that was able to read the text and understand what the text is about, and then organise it into, let's say, these are some of the findings from the Radiology that are related to this part of the body or that, and this is what the radiologist has seen. And then from there, once you organize this data, you will be able to apply those algorithms that you have mentioned before and then get some knowledge from there.
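As a rough illustration of what "structuring" a free-text report can mean, the sketch below expands a few abbreviations and pulls out an organ, a finding and a size. It is not the program described in the interview, which is not detailed here: the `structure_report` function, the `ABBREVIATIONS` and `ORGANS` lists and the sample sentence are all assumptions made up for this example, and a real system would need clinically reviewed vocabularies and far more robust language processing.

```python
import re

# Illustrative vocabulary only (assumed, not taken from any real system).
ABBREVIATIONS = {"rt": "right", "lt": "left", "kid": "kidney"}
ORGANS = ["kidney", "liver", "gallbladder"]


def structure_report(text: str) -> dict:
    """Turn a scribbled report line into named fields."""
    # Split into lowercase words and numbers, dropping punctuation.
    tokens = re.findall(r"[a-z]+|\d+(?:\.\d+)?", text.lower())
    # Expand known abbreviations so that different spellings line up.
    words = [ABBREVIATIONS.get(t, t) for t in tokens]
    normalised = " ".join(words)

    # Fill the structured fields; None means "not found in the text".
    side = next((w for w in words if w in ("left", "right")), None)
    organ = next((o for o in ORGANS if o in words), None)
    size = re.search(r"(\d+(?:\.\d+)?) mm", normalised)
    return {
        "side": side,
        "organ": organ,
        "finding": "stone" if "stone" in words else None,
        "size_mm": float(size.group(1)) if size else None,
        "normalised_text": normalised,
    }
```

With these assumed lists, a line such as `"Rt kid: stone 4mm."` becomes a record with side `right`, organ `kidney`, finding `stone` and size `4.0`, which is the kind of database-like row that standard algorithms can then work on.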

[Lewis Allsopp] So, I suppose we're looking at, because obviously you know working with humans is never easy, we're all different.

[Farid Meziane] That's right.  

[Lewis Allsopp] So, as these people and as these experts are writing things down on the job, they've all got their own way of doing things, haven't they? Their own abbreviations and little nuances. And I suppose it's like you say, deconstructing all of those bits, putting stuff in groups and then having a more formatted version of what they've written to then be able to take some understanding from it. How does that help a GP? 

[Farid Meziane] It's not the GP itself that it’s going to help, but remember we learn a lot from data. So, let's say when you go to your bank or, let's say, within the University, we have what we call databases. In the database, you have structured data, so you have the name, you have the date of birth, you have the address, you have, let's say for students, the marks or the results in A Levels and so on. The data is structured, so when you try to extract knowledge it's very easy. When you have text, you don't have that particular field where you want to look and see whether it's written or filled, so by doing this, we are structuring the data. One of the problems that we did face before was that Radiologists did not like a structured report, basically, where you say okay, you put the name here, you put your findings here, you put your organ here, you put the size of the stones, let's say. They just want to write, because either they are under time pressure or they don't like that kind of structure. But for us data scientists, it's very hard to work on that type of data, so we need to give it some kind of structure. So, it's for this purpose that we have done that work. It’s not going to make a big difference to the GP, but it's going to make a big difference for researchers for finding relationships between those kinds of findings that we find in the report, and why not link that information from there to the electronic patient record that is widely used these days and then try to predict a few things. Maybe that patient is more likely to develop a particular disease, maybe that patient is likely to have complications, and this is something that we learned from previous data.

So, somehow it may help the GPs because it's giving him or her some information that is not available, because the computer programme will go through thousands of records and say look, they’re quite [inaudible], because if patients have this and the patients have these kinds of characteristics, then it’s more likely that they are going to develop this particular disease or condition.

[Lewis Allsopp] It’s really interesting because it’s genuine, you know, we often talk about research and innovation and it can sometimes be a little bit wishy-washy at the start, but actually, there’s a real-world use for this. You know, we can help to predict things and I suppose that’s all your work, you know, that’s what your work is aiming to do. It’s aiming to predict things and give us a better understanding of things. Obviously, this has, you know, a multitude of real-world applications. It’s useful to be able to predict anything. And technology is advancing all the time as we know, and you’re probably a part of that advancement to be fair.

What would you like to see from technology? What would you like to be able to do? Is there anything that you can't do now, or you haven't quite cracked now? What would you like to see from technology as a whole, to help this further?

[Farid Meziane] I think, with the technology, we are living, let's say as Computer Scientists or Data Scientists, at the moment, in very interesting times where we have powerful computers. So, if you look at 50, 60, 70 years ago (when I started doing my research), you know, you would be very lucky to get small discs where you have 512 kilobytes of data that you can store on it, and the disc was quite big. And now, anyone can have a one-terabyte stick in their hands and basically save any data they want, or use it for music, or for movies, or anything like that. So, we are very lucky to have reached this point where power is not an issue, we have supercomputers. Storage of data is not an issue, so we can store as much data as we can, and therefore we just want to develop strong algorithms, powerful algorithms, and in data science we talk about precise algorithms, because when you develop these kinds of systems, they are not always 100 per cent correct. They are going to give you, let's say, 70 per cent, 80 per cent, 90 per cent and so on. So, depending on the area of application, sometimes you will be happy with 80 per cent precision, and sometimes you won't. So, if you are working in health, for example, you don't want a system that is going to get 20 per cent of the cases wrong about a particular patient.
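The "80 per cent" figure above can be made concrete with a small calculation. In the conversation "precision" is used informally for how often a system is right; in data science the share of all cases a system gets right is usually called accuracy, while precision in the stricter sense is the share of positive predictions that are truly positive. The sketch below computes both on made-up labels; the function names, the labels and the 0.95 threshold in the usage note are assumptions for illustration only.

```python
def accuracy(y_true, y_pred):
    """Share of all cases where the prediction matches the truth."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def precision(y_true, y_pred, positive=1):
    """Share of positive predictions that are truly positive."""
    truths_for_positives = [t for t, p in zip(y_true, y_pred) if p == positive]
    if not truths_for_positives:
        return 0.0  # no positive predictions were made
    hits = sum(t == positive for t in truths_for_positives)
    return hits / len(truths_for_positives)


def meets_requirement(score, required):
    """Check a score against the level an application area demands."""
    return score >= required


# Invented labels: 1 = condition present, 0 = condition absent.
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]
```

On these invented labels the system is right in 8 of 10 cases. A call such as `meets_requirement(accuracy(y_true, y_pred), 0.95)` returns `False`, which mirrors the point that a level acceptable in one application area may not be acceptable in health.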

[Lewis Allsopp] Obviously, we were speaking about, you know, this advancement of technology and the fact that we've got lots of power in our computers now and lots of storage. Given all of that, I know that these things are going to take time, but do you think it's possible, do you know what I mean, to get this precision that we need for cases like, you know, seeing if someone's got cancer or not?

[Farid Meziane] Yeah if you're following the news, last year Google, because they're very powerful at the moment and to be honest with you particularly when it comes to language and text, they are probably the strongest company in the world, and this is where most of the research is at the moment. They have developed a system for detecting breast cancer and that system was performing better than a specialist, than a human being. So, this is how far we were. There are areas where computer programs are performing better than specialists.

[Lewis Allsopp] That's really amazing.

[Farid Meziane] A colleague of mine, while I was at Salford, had prostate cancer, had a tumour, and the whole procedure was done through a robot, so it was, I mean, probably guided by a human being, but everything was done by the robot. So, when he came, he just told me “I have seen artificial intelligence in practice, I've been operated on by a robot”.

[Lewis Allsopp] That's amazing.  

[Farid Meziane] So that's the kind of thing that, I think, we are very lucky that we have reached the stage where you can trust the technology. Of course, it's not perfect, like everything else, it has got its pluses and its minuses and Data Science is not an exception to the rule.

[Lewis Allsopp] Listen, it's been really interesting to speak to you. It's an area of the world that really interests me, so thank you very much for chatting to me and good luck with everything. 

[Farid Meziane] Thank you very much. Thank you for having me. Thank you. 

[Lewis Allsopp] That is Professor Farid Meziane, who's the Head of Data Science theme at the University of Derby and a Professor of Data Science. And that is it for this episode of ‘For a Better Tomorrow’ the University of Derby's Innovation and Research Podcast.  

In other episodes, I've looked at the research answering questions like ‘how looking at events of the past can inform our current decision making’ and ‘why mirrors are helping us to make solar energy more efficient’, so be sure to check them out wherever you get your podcasts and follow the University @DerbyUni. I'll see you next time. Bye-bye.

[Music]

[Lewis Allsopp] ‘For a Better Tomorrow’ was presented by me, Lewis Allsopp and produced by myself and Dr Daithí McMahon in the School of Arts for the University of Derby. 
