Microsoft Research India Podcast

Microsoft Research India

  • Technology

A technology and research podcast from Microsoft Research India

Episodes

  • Evaluating LLMs using novel approaches. With Dr. Sunayana Sitaram
    20 May 2024 · Microsoft Research India Podcast

    [Music]

    Sunayana Sitaram: Our ultimate goal is to build evaluation systems, and also other kinds of systems in general, where humans and LLMs can work together. We're really trying to get humans to do the evaluation, get LLMs to do the evaluation, use the human data in order to improve the LLM. And then this just continues in a cycle. And the ultimate goal is: send the things to the LLM that it's good at doing, and send the rest of the things that the LLM can't do to humans, who are like the ultimate authority on the evaluation.

    Sridhar Vedantham: Welcome to the Microsoft Research India podcast, where we explore cutting-edge research that’s impacting technology and society. I’m your host, Sridhar Vedantham.

    [Music]

    Sridhar Vedantham: LLMs are perhaps the hottest topic of discussion in the tech world today. And they're being deployed across domains, geographies, industries and applications. I have an extremely interesting conversation with Sunayana Sitaram, Principal Researcher at Microsoft Research, about LLMs: where they work really well, and also the challenges that arise when trying to build models with languages that may be under-resourced. We also talk about the critical work she and her team are doing in creating state-of-the-art methods to evaluate the performance of LLMs, including those LLMs that are based on Indic languages.


    [Music]

    Sridhar Vedantham: Sunayana, welcome to the podcast.

    Sunayana Sitaram: Thank you.

    Sridhar Vedantham: And I'm very excited to have you here because we get to talk about a subject that seems to be top of mind for everybody right now. Which is obviously LLMs. And what excites me even more is that, I think, we're going to be talking about LLMs in a way that's slightly different from what the common discourse is today, right?

    Sunayana Sitaram: That's right.

    Sridhar Vedantham: OK. So before we jump into it, why don't you give us a little bit of background about yourself and how you came to be at MSR?

    Sunayana Sitaram: Sure. So it's been eight years now since I came to MSR. I came here as a postdoc after finishing my PhD at Carnegie Mellon. And so yeah, it's been around 15 years now for me in the field, and it's been super exciting, especially the last few years.

    Sridhar Vedantham: So, I'm guessing that these eight years have been interesting, otherwise we wouldn't be having this conversation. What areas of research, I mean, have you changed course over the years, and how has that progressed?

    Sunayana Sitaram: Yeah, actually, I've been working pretty much on the same thing for the last 15 years or so. So I'll describe how I got started. When I was an undergrad, I actually met the principal of a blind children's school who himself was visually impaired. And he was talking about some of the technologies that he uses in order to be independent. And one of those was using optical character recognition and text-to-speech in order to take documents or letters that people sent him and have them read out without having to depend on somebody. And he was in Ahmedabad, which is where I grew up. And his native language was Gujarati. And he was not able to do this for that language, whereas for English, the tools that he required to be independent were available. And so, he told me it would be really great if somebody could actually build this kind of system in Gujarati. And that was, you know, an aha moment for me. And I decided to take that up as my undergrad project. And ever since then, I've been trying to work on technologies that bridge that gap between English and other, under-resourced languages. And so, since then, I've worked on very related areas. So, my PhD thesis was on text-to-speech systems for low-resource languages. And after I came to MSR I started working on what is called code switching, which is a very common thing that multilinguals all over the world do. So they use multiple languages in the same conversation or sometimes even in the same sentence. And so, you know, this was a project called Project Melange that was started here, and that really pioneered the code-switching work in the NLP research community. And after that it's been about LLMs and evaluation, but again from a multilingual, under-resourced languages standpoint.

    Sridhar Vedantham: Right. So I have been here for quite a while at MSR myself, and one thing that I always heard is that there is, in general, a wide gulf in terms of the resources available for a certain set of languages to do, say, NLP-type work. And the other languages are just the tail; it's a long tail, but the tail just falls off dramatically. So, I wanted you to answer me in a couple of ways. One is, what is the impact that this generally has in the field of NLP itself and in the field of research into language technologies, and what's the resultant impact on LLMs?

    Sunayana Sitaram: Yeah, that's a great question. So, you know, the paradigm has shifted a little bit after LLMs came into existence. Before this, so this was around, say, a few years ago, the paradigm would be that you would need what is called unlabeled data. So, that is raw text that you can find on the web, say Wikipedia or something like that, as well as labeled data. So, this is something that a human being has actually sat and labeled for some characteristic of that text, right? So these are the two different kinds of text that you need if you want to build a text-based language model for a particular language. And so there were languages where, you know, you would find quite a lot of data on the web, because it was available in the form of documents or social media, etc. for certain languages. But nobody had actually created the labeled resources for those languages, right? So that was the situation a few years ago. And you know, the paradigm at that time was to use both these kinds of data in order to build these models, and our lab actually wrote quite a well-regarded paper called ‘The State and Fate of Linguistic Diversity and Inclusion’, where they grouped different languages into different classes based on how much labeled and unlabeled data they had.

    Sridhar Vedantham: Right.

    Sunayana Sitaram: And it was very clear from that work that, you know only around 7 or 8 languages of the world actually can be considered to be high resource languages which have this kind of data. And most of the languages of the world spoken by millions and millions of speakers don't have these resources. Now with LLMs, the paradigm changed slightly, so there was much less reliance on this labeled data and much more on the vast amount of unlabeled data that exists, say, on the web. And so, you know, we were wondering what would happen with the advent of LLMs now to all of the languages of the world, which ones would be well represented, which ones wouldn't etc. And so that led us to do, you know, the work that we've been doing over the last couple of years. But the story is similar, that even on the web some of these languages dominate and so many of these models have, you know, quite a lot of data from only a small number of languages, while the other languages don't have much representation.

    Sridhar Vedantham: OK. So, in real terms, in this world of LLMs that we live in today, what kind of impact are we looking at? I mean, when you're talking about inequities in LLMs and in this particular field, what's the kind of impact that we're seeing across society?

    Sunayana Sitaram: Sure. So when it comes to LLMs and language coverage, what we found from our research is that there are a few languages that LLMs perform really well on. Those languages tend to be high-resource languages for which there is a lot of data on the web, and they also tend to be languages that are written in the Latin script, because of the way the LLMs are currently designed, with the tokenization. And for the other languages, unfortunately, there is a large gap between the performance in English and in other languages, and we also see that a lot of capabilities that we see in LLMs in English don't always hold in other languages. So a lot of capabilities, like really good reasoning skills, etc., may only be present in English and a few other languages, and they may not be seen in other languages. And this is also true when you go to smaller models: you see that their language capabilities fall off quite drastically compared to the really large models that we have, like the GPT-4 kind of models. So when it comes to real-world impact of this, you know, if you're trying to actually integrate one of these language models into an application and you're trying to use it in a particular language, chances are that you may not get as good performance in many languages compared to English. And this is especially true if you're already used to using these systems in English and you want to use them in a second language. You expect them to have certain capabilities which you've seen in English, and then when you use them in another language, you may not find the same capabilities. So in that sense, I think there's a lot of catching up to do for many languages. And the other issue also is that we don't even know how well these systems perform for most languages of the world, because we've only been able to evaluate them on around 50 to 60, or maybe 100, languages. So for the rest of the 6000-ish languages of the world, many of which don't even have a written form, most of which are not there on the web, we don't even know whether these language models are, you know, able to do anything in them at all. So I think that is another, you know, big problem that exists currently.
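
    To make the tokenization point concrete, here is a minimal sketch (not from the podcast) of how one might measure it: the same question in a Latin-script language and in Devanagari is often split into very different numbers of tokens by a BPE tokenizer. The model name and example sentences below are arbitrary illustrative choices.

    ```python
    # Sketch: comparing tokenizer "fertility" across scripts.
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")  # any BPE tokenizer works here

    samples = {
        "English": "How do I renew my passport?",
        "Hindi": "मैं अपना पासपोर्ट कैसे नवीनीकृत करूं?",
    }

    for language, text in samples.items():
        tokens = tokenizer.tokenize(text)
        # More tokens per word means a worse fit between the tokenizer and
        # the script, which makes inputs costlier and harder to model.
        print(f"{language}: {len(tokens)} tokens for {len(text.split())} words")
    ```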

    Sridhar Vedantham: So, how do we change the situation, where, you know, even if you're a speaker of a language that might be small, maybe say only two million speakers as opposed to a larger language that might have 100 million or 200 million speakers? How do we even go about addressing inequities like that? Because at some level it just seems unfair that, for no fault of their own, you know, large sections of the population could be excluded from the benefits of LLMs, right? Because there could be any number of languages in which the number of speakers might be, say, a million or a hundred thousand.

    Sunayana Sitaram: Right. I think that's a very hard question. How to actually involve language communities in our efforts, but do that at scale, so that we can actually involve all language communities, all cultures, etc. in the whole building process. So we've had some success with doing this with some language communities. So there is a project called ELLORA in MSR India that, you know, Kalika leads, where they work with specific language communities, try to understand what the language communities actually need, and then try to co-build those resources with them. And so, you know, in that sense, working directly with these language communities, especially those that have a desire to build this technology and can contribute to some of the data aspects, etc., that's definitely one way of doing things. We've also done some work recently where we've engaged many people in India in trying to contribute resources in terms of cultural artifacts and also evaluation. And so, you know, we're trying to do that with the community itself, with the language community that is underrepresented in these LLMs, but doing that at scale is the challenge, to try and really bring everyone together. Another way, of course, is just raising awareness about the fact that this issue exists, and I think our work over the last couple of years has really, you know, moved the needle on that. So we've done the most comprehensive multilingual evaluation effort that exists, both within the large models as well as across different-sized models, which we call Mega and Megaverse respectively.

    Sridhar Vedantham: So if I can just interrupt here, what I'd like is if you could, you know, spend a couple of minutes maybe talking about what evaluating an LLM actually means and how do you go about that?

    Sunayana Sitaram: Sure. So when we talk about evaluating LLMs, right, there are multiple capabilities that we expect LLMs to possess. And so, our evaluation should ideally try to test for all of those different kinds of capabilities. So, this could be the ability to reason, this could be the ability to produce output that actually sounds natural to a native speaker of the language. It could be completing some particular task; it could be not hallucinating or not making up things. And also, of course, responsible AI metrics. So things like, you know, being safe and fair, no bias, etc. Only if all of those things work in a particular language can you say that that LLM actually works for that language. And so there are several dimensions that we need to consider when we are evaluating these LLMs.
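
    As a toy rendering of that last point, a language only counts as supported if every dimension clears the bar. The dimension names and threshold below are illustrative assumptions, not an actual rubric from this work.

    ```python
    # Hypothetical multi-dimensional pass/fail check for one language.
    DIMENSIONS = ["reasoning", "natural_output", "task_completion",
                  "factuality", "responsible_ai"]

    def language_supported(scores, threshold=0.7):
        """scores maps each dimension to a 0-1 score for one language;
        a single failing dimension means the language is not supported."""
        return all(scores.get(dim, 0.0) >= threshold for dim in DIMENSIONS)

    print(language_supported({"reasoning": 0.9, "natural_output": 0.8,
                              "task_completion": 0.85, "factuality": 0.75,
                              "responsible_ai": 0.6}))  # False: one dimension fails
    ```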

    [Music]

    Sridhar Vedantham: Before I interrupted you, you were talking about certain projects that you were working on which are to do with evaluating LLMs, right? I think there's something called Mega, and there's something called Megaverse. Could you tell us a little bit about those and what exactly they do?

    Sunayana Sitaram: Sure. So, the Mega project we started when ChatGPT came out, basically. And the question that we were trying to answer was how well these kinds of LLMs perform on the languages of the world. So with Mega, what we did was take already existing open-source benchmarks that tested for different kinds of capabilities. So some of them were question-answering benchmarks. Some of them were testing for whether it can summarize text properly or not. Some of them were testing for other capabilities, like reasoning, etc. And we tested a bunch of models across all of these benchmarks, and we covered something like 80 different languages across all these benchmarks. And our aim with Mega was to figure out what the gap was between English and other languages for all of these different tasks, but also what the gap was between the older models, so the models pre-LLMs, and the LLMs. Whether we've become better or worse in terms of linguistic diversity and performance on different languages in the new era of LLMs or not. And that was the aim with Mega.
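
    As a sketch of the bookkeeping such a study involves, the helper below computes per-language gaps relative to English for one model on one benchmark. The score() function, model name, and benchmark name are placeholders, not the actual Mega harness.

    ```python
    # Hypothetical helper: score(model, benchmark, language) is assumed to
    # return an accuracy-like number in [0, 1].
    def english_gap(score, model, benchmark, languages):
        """Per-language performance gaps relative to English."""
        english = score(model, benchmark, "en")
        return {lang: round(english - score(model, benchmark, lang), 3)
                for lang in languages}

    # Usage with a stub scorer (made-up numbers):
    stub = lambda model, benchmark, lang: {"en": 0.85, "hi": 0.62, "sw": 0.48}[lang]
    print(english_gap(stub, "some-llm", "some-qa-benchmark", ["hi", "sw"]))
    # {'hi': 0.23, 'sw': 0.37} -- a larger gap means worse relative performance
    ```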

    Sridhar Vedantham: Sorry, but what was the result of it? Have we become better or not?

    Sunayana Sitaram: Yeah. So we have mixed results. So for certain languages, we are doing quite well, but for some languages, unfortunately, the larger models don't do as well as some of the older models used to do. And the older models used to be specialized and trained on labeled data, as I said in the beginning, right? And that would help them also be better at all the languages under consideration, whereas with the LLMs we were not really using labeled data in a particular language to train them. And so, we found that, you know, in some cases the performance in English had shot up drastically, and so the gap between English and other languages had also increased in the LLMs' case.

    Sridhar Vedantham: OK. So, the performance in English, I mean of the LLMs, is much better than what there was earlier, but the other languages didn't manage to show the same performance increase.

    Sunayana Sitaram: That's right. They didn't always show it. Some of them did. Some of the higher-resource languages written in the Latin script, for example, did perform quite well, but some of the others did not.

    Sridhar Vedantham: OK. And after Mega, then what happened?

    Sunayana Sitaram: Yes. So with Mega, we were primarily trying to evaluate the GPT family of models against the older generation of models, as I mentioned. But then we realized, by the time we finished the work on Mega, there was a plethora of models that came out. So there's Llama and, you know, other models by competitors, as well as smaller models, the SLMs, like the Llama-sized models, Mistral, etc., right. So, there were all of these different models. And then we wanted to see, across different models, especially when you're trying to compare larger models with smaller models, how do these trends look? And that is what we call Megaverse, where we do all of these different evaluations, not just for the GPT family, but across different models. And what we found in Megaverse was that the trends were similar: there were some languages that were doing well, and some of the other lower-resource languages, especially the ones written in other scripts, were not doing so well. So, for example, the Indian languages were not doing very well across the board. But we also found that the larger frontier models, like the GPT models, were doing much better than the smaller models on multilingual tasks. And this is again something that, you know, was shown for the first time in our work, that there is this additional gap between the large models and the small models, and there are important practical implications of this. So, say you're trying to integrate a small model into your workflow as a startup, or something like that, in a particular language, because it is cheaper, much more cost-efficient, etc. Then you may not get the same performance in non-English languages as you would get with the larger model, right? So that has an impact in the real world.

    Sridhar Vedantham: Interesting. And how do you draw this line between what constitutes a large language model and what constitutes a small language model? And I'm also increasingly hearing of this thing called a tiny language model.

    Sunayana Sitaram: That's right. Yeah. So the large language models are the GPTs, the Geminis, you know, those kinds of models. Everything else we just club together as smaller language models. We don't really draw a line there. I haven't actually seen any tiny models that do very well on multilingual tasks. They're tiny because they are, you know, trained on a smaller set of data, they have fewer parameters, and typically we haven't seen too many multilingual tiny models, so we haven't really evaluated those. Although there is a new class of models that has started coming up, which are language-specific models. So, for example, a lot of the Indic model developers have started building specialized models for one language or a small family of languages.

    Sridhar Vedantham: OK, so going back to something you said earlier, how do these kinds of models that people are building for specific Indian languages actually work or perform, given that, I think, we established quite early in this podcast that these are languages that are highly under-resourced in terms of data to build models?

    Sunayana Sitaram: That's right. So I think it's not just a problem of them being under-resourced; it's also that the proportion of data in the model for a particular language that is not English, say Hindi or Malayalam or Kannada, is very tiny compared to English. And so there are ways to actually change this by doing things with the model after it has been trained. So this is called fine-tuning. So what you could do is you could take, say, an open-source model, which is like a medium-sized or a small model, and then you could fine-tune it or specialize it with data in a particular language, and that actually makes it work much better for that particular language, because the distribution shifts towards the language that you're targeting. And so, it's not just, you know, about the amount of data, but also the proportion of data and how the model has been trained: these giant models that cover hundreds of languages in a single model versus, you know, a model that is specialized to just one language, which makes it do much better. So these Indic models, we have found, actually do better than the open-source models that they were built on top of, because now they have been specialized to a particular language.
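
    As a rough illustration of the fine-tuning idea just described, here is a minimal continued-training sketch using the Hugging Face transformers library. The base model, corpus file name, and hyperparameters are stand-ins; this is not the recipe any particular Indic model actually used.

    ```python
    # Sketch: specializing a small open-source causal LM on monolingual
    # target-language text, shifting its distribution toward that language.
    from datasets import load_dataset
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              DataCollatorForLanguageModeling,
                              Trainer, TrainingArguments)

    model_name = "gpt2"  # stand-in for any open-weights base model
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    tokenizer.pad_token = tokenizer.eos_token  # gpt2 has no pad token by default
    model = AutoModelForCausalLM.from_pretrained(model_name)

    # Hypothetical file of raw target-language text (e.g., Gujarati).
    dataset = load_dataset("text", data_files={"train": "gujarati_corpus.txt"})
    tokenized = dataset["train"].map(
        lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
        batched=True, remove_columns=["text"])

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="indic-finetuned",
                               num_train_epochs=1,
                               per_device_train_batch_size=4),
        train_dataset=tokenized,
        # mlm=False gives the standard next-token (causal LM) objective.
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    )
    trainer.train()
    ```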

    Sridhar Vedantham: OK. I know that your work focuses primarily on the evaluation of LLMs, right? There must be a lot of other people who are also doing similar work in terms of evaluating the performance of LLMs on different parameters. How do you differentiate your work from what others are doing?

    Sunayana Sitaram: Yeah, that's a great question. So we've been doing evaluation work since pre-LLM times, actually. We started this a few years ago. And so we've actually done several evaluation projects. The previous one was called LITMUS, where we were trying to evaluate without even having a benchmark in a particular language, right? And so we've built up a lot of expertise in how to do evaluation, and this has actually become a very hard problem in the LLM world, because it's becoming increasingly difficult to figure out what the strengths and weaknesses of these LLMs are, because of how they're built and how they behave, right. And so I think we bring in so much rich evaluation expertise that we've been able to do these kinds of, you know, Mega evaluations in a very systematic way, where we've taken care of all of the hanging loose threads that others don't take care of. And that is why we managed to do these comprehensive, giant exercises of Mega and Megaverse and also got these clear trends from them. So in that sense, I would say that our evaluation research is very mature, and we've been spending a lot of time thinking about how to evaluate, which is unique to our group.

    Sridhar Vedantham: OK, so one thing I've been curious about for a while is there seems to be a lot of cultural and social bias that creeps into these models, right? How does one even try to address these issues?

    Sunayana Sitaram: That's right. So, I think over the last few months, building culture-specific language models, or even evaluating whether language models are appropriate for a particular culture, etc., has become a really hot topic. Because people have started seeing that, you know, most of these language models are a little tilted towards Western, Protestant, rich, industrialized kinds of worldviews, and the values that they encode may not be appropriate for all cultures. And so there have been some techniques that we've been working on in order to, again, shift the balance back towards other target cultures that we want to fine-tune the model for. So again, you know, you could take data that has characteristics of a particular culture, values of a particular culture, and then do some sort of fine-tuning on a model in order to shift its distribution more towards a target culture. There are techniques that are coming up for these kinds of culture-specific language models. However, I still think that we are far away from a comprehensive solution, because even defining what culture is and what constitutes, you know, say, an Indic-culture LLM, I think that's a really hard problem. Because culture is complex and there are so many factors that go into determining what culture is, and also it's deeply personal. So, each individual has their own mix of factors that determine their own culture, right? So, generalizing that to an entire population is also quite hard, I think, to do. So, I think we're still in the very initial stages in terms of actually figuring out how well aligned these models are to different cultures, and also trying to sort of align them to any specific target cultures. But it is a hot topic that a lot of people are currently working on.

    Sridhar Vedantham: Yeah. You know, while you're talking and giving me this answer, I was thinking that if you're going to go culture by culture, first of all, you know, what is the culture, what are you doing about subcultures and how many cultures are there in the world, so I was just wondering how it's going to even work in the long term? But I guess you answered the question by saying it's just starting. Now let's see how it goes.

    Sunayana Sitaram: Absolutely.

    Sridhar Vedantham: It's a very, very open canvas right now, I guess.

    Sunayana Sitaram: Yeah.

    Sridhar Vedantham: Sunayana, you know you've been speaking a lot about evaluation and so on and so forth and especially in the context of local languages and smaller languages and Indic languages and so on. Are these methods of evaluation that you talk about, are they applicable to different language groups and languages spoken in different geographies too?

    Sunayana Sitaram: Absolutely. So in the Mega and Megaverse work, we covered 80 languages, and many of them were not Indic languages. In fact, in the Megaverse work, we included a whole bunch of African languages as well. So the techniques, you know, would be applicable to all languages for which data exists on the web. Where it is challenging is the languages that are only spoken, that are not written, and the languages for which there is absolutely no data or representation available on the web, for example. Unfortunately, there aren't benchmarks available for those languages, and so we would need to look at other techniques. But other than that, our evaluation techniques are for, you know, all languages, all non-English languages.

    [MUSIC]

    Sridhar Vedantham: There is something that I heard recently from you which, again, I found extremely interesting. It's a project called Pariksha, which I know, in Hindi and derived from Sanskrit, basically means test or exam. And I remember this project because I'm very scared of tests and exams, and always have been, since school. But what is this?

    Sunayana Sitaram: Yes, Pariksha is actually quite a new project. It's under the project VeLLM, which is on universal empowerment with Large Language Models, and Pariksha is something that we are super excited about, because it's a collaboration with Karya, which is an ethical data company that was spun off from MSR India. So what we realized a few months ago is that, you know, there is just so much happening in the Indic LLM space, and there are so many people building specialized models, either for a single language or for a group of languages, like the Dravidian languages, for example. And of course, there are also the GPTs of the world, which do support Indian languages as well, right. So now, at last count, there are something like 30 different Indic LLMs available today. And if you're a model builder, how do you know whether your Indic LLM is good, or better than all of the other LLMs? If you're somebody who wants to use these models, how do you know which ones to pick for your application? And if you're a researcher, you know, how do you know what the big challenges are that still remain, right?

    Sridhar Vedantham: Right

    Sunayana Sitaram: And so to address this, of course, you know, one way is to do evaluation, right. And try to compare all these models on some standard benchmarks and then try to figure out which ones are the best. However, what we found from our work with Mega and Megaverse is that the Indian-language benchmarks, unfortunately, are usually translations of already existing English benchmarks, and also many of them are already present in the training data of these large models, which means that we can't use the already existing benchmarks to get a very good idea about whether these Indic LLMs are culturally appropriate, or whether they capture linguistic nuances in the Indian languages or not, right. So we decided to sort of reinvent evaluation for these Indic LLMs, and that's where Pariksha came in. But then, how do we scale if we want to actually, you know, get this kind of evaluation done? We were looking at human evaluation to do this, right. And so, we thought of partnering with Karya on this, because Karya has reach in all the states in India, and they have, you know, all of these workers who can actually do this kind of evaluation for different Indian languages. And so, what Pariksha is, is a combination of human evaluation as well as automated evaluation. And with this combination we can scale, and we can do thousands and thousands of evaluations, which we have already done, actually, on all of these different models. And so this is the first time, actually, that all of the different Indic LLMs that are available are being compared to each other in a fair way. And we are able to come up with a leaderboard now of all of the Indic models for each Indic language that we are looking at. So that's what Pariksha is. It's quite a new project, and we've already done thousands of evaluations, and we are continuing to scale this up even further.
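
    One common way to turn pairwise "model A beat model B" judgments, whether from human evaluators or from an LLM judge, into a per-language leaderboard is an Elo-style rating, sketched below with made-up model names. Whether Pariksha uses exactly this scheme is not stated here; the sketch only shows the general mechanism.

    ```python
    # Sketch: Elo-style leaderboard from pairwise comparison outcomes.
    def elo_update(ratings, winner, loser, k=32):
        expected = 1 / (1 + 10 ** ((ratings[loser] - ratings[winner]) / 400))
        ratings[winner] += k * (1 - expected)
        ratings[loser] -= k * (1 - expected)

    ratings = {"indic_model_a": 1000.0, "indic_model_b": 1000.0,
               "indic_model_c": 1000.0}
    battles = [("indic_model_a", "indic_model_b"),  # each tuple: (winner, loser)
               ("indic_model_c", "indic_model_a")]
    for winner, loser in battles:
        elo_update(ratings, winner, loser)
    print(sorted(ratings.items(), key=lambda kv: -kv[1]))  # the leaderboard
    ```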

    Sridhar Vedantham: So how does someone, you know, if I have an LLM of my own in any given Indic language, how do I sign up for Pariksha, or how do I get myself evaluated against the others?

    Sunayana Sitaram: Yeah. So you can contact any of us on the Pariksha team for that. And we will basically include the new model in the next round of evaluation. So what we do with Pariksha is we do several rounds. We've already finished a pilot round, and we're currently doing the first round of evaluations. So we would include the new model in the next round of evaluations. And you know, as long as it's an open-source model, or there is API access available for that model, we can evaluate the model for you. We are also planning to release all the artifacts from Pariksha, including all the evaluation prompts. So even if it is a closed model, you can use these to do your own evaluation as well, later, to figure out how you compare with the other models on the leaderboard.
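
    A minimal sketch of what replaying such released prompts against your own model could look like. The JSONL format, field names, and HTTP endpoint here are all assumptions for illustration, not an actual Pariksha artifact format or API.

    ```python
    # Sketch: run released evaluation prompts through your own model endpoint.
    import json

    import requests

    def replay_prompts(prompt_file, model_endpoint):
        """Send each released prompt to a hypothetical completion endpoint."""
        results = []
        with open(prompt_file, encoding="utf-8") as f:
            for line in f:
                prompt = json.loads(line)["prompt"]  # assumed field name
                resp = requests.post(model_endpoint, json={"prompt": prompt})
                results.append({"prompt": prompt,
                                "output": resp.json()["text"]})  # assumed schema
        return results
    ```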

    Sridhar Vedantham: Right. Quick question. When you say that you're working with Karya, you also say that you're looking at human evaluation along with the regular methods of evaluation. Why do you need human evaluation at all in these situations? Isn't it just simpler to throw everything into a machine and let it do the work?

    Sunayana Sitaram: Yeah, that's a great question. So we did some work on, you know, making machines the evaluators. So basically asking GPT itself to be the evaluator, and it does a very good job at that. However, it has some blind spots. So we found that GPT is not a very good evaluator in languages other than English. Basically, it's not a good evaluator in the languages that it doesn't otherwise do well in, and so using only automated techniques to do evaluation may actually give you the wrong picture. It may give you the wrong sort of trends, right? And so we need to be very careful. And so our ultimate goal is to build evaluation systems, and also other kinds of systems in general, where humans and LLMs can work together. And so the human evaluation part is to have checks and balances on the LLM evaluation part. Initially, what we are doing is we're getting the same things evaluated by the human, and the LLM is doing the exact same evaluation. So we have a point-by-point comparison of what the humans are saying and what the LLM is saying, so that we can really see where the LLM goes wrong, right. Where it doesn't agree with humans. And then we use all of this information to improve the LLM evaluator itself. So we're really trying to get humans to, you know, do the evaluation, get LLMs to do the evaluation, use the human data in order to improve the LLM. And then this just continues in a cycle. And the ultimate goal is: send the things to the LLM that it's good at doing, and send the rest of the things that the LLM can't do to humans, who are like the ultimate authority on the evaluation. So it's like this hybrid system that we are designing with Pariksha.
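
    The cycle described here can be pictured as a routing policy: items where past human-LLM agreement is high stay automated, and the rest go to human evaluators, whose verdicts also refresh the agreement statistics. Everything below (the field names, the agreement store, the threshold) is a hypothetical rendering of that idea, not the Pariksha implementation.

    ```python
    # Sketch: route each evaluation item to the LLM judge or to humans.
    def route_evaluations(items, llm_judge, agreement, threshold=0.8):
        """agreement maps (language, task) to observed human-LLM agreement
        from earlier rounds; llm_judge(item) returns the LLM's verdict."""
        automated, needs_human = [], []
        for item in items:
            key = (item["language"], item["task"])
            if agreement.get(key, 0.0) >= threshold:
                automated.append((item, llm_judge(item)))  # LLM is trusted here
            else:
                needs_human.append(item)  # humans remain the final authority
        return automated, needs_human
    ```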

    Sridhar Vedantham: Interesting. OK, so I know we are kind of running out of time. My last question to you would be, where do you see evaluation of LLMs, and where do you see your work going or progressing in the near future?

    Sunayana Sitaram: So evaluation, for me, is a path to understanding what these systems can do and cannot do, and then improving them, right. So our evaluations are always actionable. We try to figure out why something is not working well. So even in the Mega paper, we had lots of analysis about what factors may lead to, you know, lower performance in certain languages, etc. So I see all of this as providing a lot of rich information to model developers in order to figure out what the next steps should be and how they should be designing the next generation of models, and I think that has already happened. You know, systems have already improved from the time we started working on Mega, and a lot of the issues that we pointed out in Mega, like tokenization, etc., are now well known in the field, and people are actually taking steps to make those better in these language-specific models, etc. So I see the work as, you know, first of all raising awareness about the problems that exist, but also providing actionable insights on how we could improve things. And with Pariksha also, the idea is to release all the artifacts from our evaluation so that Indic model developers can use those to improve their systems. And so I see that, you know, better evaluation will lead to better-quality models. That's the aim of the work.

    Sridhar Vedantham: Sunayana, thank you so much for your time. I really had a lot of fun during this conversation.

    Sunayana Sitaram: Same here. Thank you so much.

    Sridhar Vedantham: Thank you.

  • HyWay: Enabling Mingling in the Hybrid World. With Dr. Venkat Padmanabhan and Ajay Manchepalli
    21 Aug 2023 · Microsoft Research India Podcast


    Ajay Manchepalli: One thing we have learned is that, you know, as they say, necessity is the mother of invention. This is a great example of that, because it's not that we didn't have remote people before. And it's not that we didn't have technology to support something like this. But we had this Black Swan moment with COVID, which required us to not be in the same physical location at all times, and that accelerated the adoption of digital technologies. You can build all the technology you want. But having it at the right time and right place matters the most.

    [Music]

    Sridhar Vedantham: Welcome to the Microsoft Research India podcast, where we explore cutting-edge research that’s impacting technology and society. I’m your host, Sridhar Vedantham.

    [Music]

    Sridhar Vedantham: The COVID pandemic forced most of us into a new paradigm of work from home, and a number of tools to cater to remote work became popular. However, the post-pandemic environment has seen interesting scenarios, with some people preferring to continue to work from home, some people preferring to return full time to the office, and a number of people adopting something in between. This hybrid work environment exists today in the workplace as well as in other scenarios, such as events. While tools such as Microsoft Teams do extremely well in supporting scheduled and agenda-driven work meetings, there is a need for a tool that supports a mix of virtual and in-person gatherings in an informal or semi-structured work environment, such as in hallways or at water coolers. In this edition of the podcast, I speak to Venkat Padmanabhan, Deputy Managing Director of MSR India, and Ajay Manchepalli, Principal Research Program Manager, about a project called HyWay. HyWay is a system to support unstructured and semi-structured hybrid and informal interactions between groups of in-person and remote participants.

    Venkat Padmanabhan is Deputy Managing Director at Microsoft Research India in Bengaluru. He was previously with Microsoft Research Redmond, USA for nearly 9 years. Venkat’s research interests are broadly in networked and mobile computing systems, and his work over the years has led to highly-cited papers and paper awards, technology transfers within Microsoft, and also industry impact. He has received several awards and recognitions, including the Shanti Swarup Bhatnagar Prize in 2016, four test-of-time paper awards from ACM SIGMOBILE, ACM SIGMM, and ACM SenSys, and several best paper awards. He was also among those recognized with the SIGCOMM Networking Systems Award 2020, for contributions to the ns family of network simulators. Venkat holds a B.Tech. from IIT Delhi (from where he received the Distinguished Alumnus award in 2018) and an M.S. and a Ph.D. from UC Berkeley, all in Computer Science, and has been elected a Fellow of the INAE, the IEEE, and the ACM. He is an adjunct professor at the Indian Institute of Science and was previously an affiliate faculty member at the University of Washington. He can be reached online at http://research.microsoft.com/~padmanab/.

    Ajay Manchepalli, as a Research Program Manager, works with researchers across Microsoft Research India, bridging research innovations to real-world scenarios. He received his Master's degree in Computer Science from Temple University, where he focused on database systems. After his Master's, Ajay spent the next 10 years shipping SQL Server products and managing their early-adopter customer programs.

    For more information about the HyWay project, click HyWay - Microsoft Research.

    For more information about Microsoft Research India, click here.


    Transcript

    [Music]

    Sridhar Vedantham: So, Venkat and Ajay, welcome to the podcast.

    Venkat Padmanabhan: Good to be here again.

    Ajay Manchepalli: Yeah, likewise.

    Sridhar Vedantham: Yeah, both of you guys have been here before, right?

    Venkat Padmanabhan: Yeah, it's my second time.

    Sridhar Vedantham: OK.

    Ajay Manchepalli: Same here.

    Sridhar Vedantham: Great! So anyway, we wanted to talk today about this project called HyWay, which, unlike the way the name sounds, is not related to one of your earlier projects which was called HAMS, which actually had to do with road safety. So, tell us a bit about what HyWay is all about and especially where the name comes from?

    Venkat Padmanabhan: Right. Yeah. So, HyWay, we spell it as H Y W A Y. It's short for hybrid hallway. It's really about hybrid interaction. What we mean by that is interaction between people who are physically present in a location- think of a conference venue or an office floor- and people who are remote. So that's where hybrid comes from, and it's really about sort of enabling informal mingling style, chitchat kind of interaction in such settings, which perhaps other platforms don't quite support.

    Sridhar Vedantham: OK. And why come up with this project at all? I mean, there are plenty of other solutions and products and ways to talk to people that are already out there. So why do we really need something new?

    Venkat Padmanabhan: Yeah, yeah. So maybe I can give you a little bit of background on this. I think in the very early days of the pandemic, in March or April of 2020, you know, all of us were locked up in our respective homes. And obviously there were tools like Teams at Microsoft, and equivalent ones like Zoom and so on elsewhere, that allowed people to stay connected and participate in work meetings and so on. But it was very clear very soon that what was missing was these informal interactions: bumping into someone in the hallway and just chatting with them. That kind of interaction was pretty much nonexistent, because, you know, if you think of something like a Teams call or, you know, a Zoom call, any of those, it's a very sanitized environment, right? If, let's say, the three of us are on a Teams call, no one else in the world knows we are meeting, and no one else in the world can overhear us or, you know, have an opportunity to join us unless they're explicitly invited. So, we said, OK, you know, we want to sort of make these meetings porous, not have these hard boundaries. And that was the starting point. And then as the months went along, we realized that, hey, the world is not going to be just remote all the while. You know, people are going to come back to the office and come back to having face-to-face meetings. And so how do you sort of marry the convenience of remote with the richer experience of being in person? And so that's where hybrid comes in. And that's something that, in our experience, existing tools, including the new tools that came up in the pandemic, don't support. There are tools that do all-virtual experiences. But there is nothing that we have seen that does hybrid the way we are trying to do it in HyWay.

    Sridhar Vedantham: Right. So, I wanted to go back to something you just said earlier, basically, when you use the term porous: what does that actually mean? Because, like you said, the paradigm in which we are used to generally conducting meetings is that it's a closed, sanitized environment. So, what exactly do we mean by porosity, and if you are in a meeting environment, why do you even want porosity?

    Venkat Padmanabhan: OK. Maybe I can give an initial answer then maybe Ajay can add. I think we're not saying every meeting is going to be porous, just to be clear, right. You know when you have a closed-door meeting and you know, maybe you're talking about sensitive things, you don't want porosity, right? You want to sort of maintain the privacy and the sanctity of that environment, but when you are trying to enable mingling in a, say, conference setting where you’re sort of bumping into people, joining a conversation, and while you're having the conversation, you overhear some other conversation or you see someone else and you want to go there. There we think something like porosity and other elements of the design of HyWay, which we can get to in a moment, allow for awareness, right? Essentially, allow you to be aware of what else is going on and give you that opportunity to potentially join other conversations. So that's where we think porosity is really important. It's not like it's something that we are advocating for all meetings.

    Ajay Manchepalli: One way to think about this is, if you are in a physical space and you want to have a meeting with somebody on a specific topic, you pick a conference room, you get together, and it's a closed-door conversation. However, when you're at a workplace, or any location for that matter, you tend to have informal conversations, right? Where you're just standing by the water cooler or standing in the hallway and you want to have discussions. And at that point in time, what you realize is that, even though you're having conversations with people, there are people nearby that you can see, and you can overhear their conversations. It's a very natural setting. However, if you're remote, you're missing out on those conversations. How do you bring them into play, right? It is not a predefined or planned conversation; you just happen to see someone or happen to hear someone and join in. And what we talk about is the natural porous nature of air, and we are trying to simulate something similar in our system.

    Sridhar Vedantham: OK. So, it's trying to mimic an actual real-life physical interaction kind of setting, where you can combine some degree of formality and informality.

    Ajay Manchepalli: Correct! And many of these platforms, like Teams or Zoom and things like that, are built on this notion of virtual presence, so you could be anywhere, and you could join and have discussions. However, our concept is more aligned with, how do I get to participate in a physical space? So, our whole philosophy is anchored on a physical location, right? If I want to be in the office by a water cooler where we have interesting conversations, how do I get into those conversations? So that physical anchoring is a very critical aspect of our work.

    Sridhar Vedantham: Interesting! Now, I'm assuming, given that you're looking at quite a different paradigm from what existed earlier, you know, Teams, for example, how do you go about figuring out what you want to do? What are the kinds of design principles involved? How do you actually figure out what people really want to participate in?

    Venkat Padmanabhan: So, I think that's a great question. I would say there's not necessarily one right answer. But you know, the way we thought about this, I think Ajay touched on this: HyWay is anchored in a physical space. So, at one level, we are just trying to recreate, as best as possible, that experience in a setting where some people are remote, right? So, what does that mean? Just imagine a physical hallway setting, right? You know, think of a conference, coffee break time, a bunch of people out in the hallway in little groups, chatting and moving around. People are very comfortable in that setting because they have a very, sort of, surprise-free experience. What I mean by that is, I could be chatting with you, and out of the corner of my eye I can see someone else approaching, I can see Ajay approaching. So, I'm not surprised by his appearance. And so, if I'm talking about something sensitive with respect to him, I have the opportunity to change subjects by the time he comes, right. So, we decided that, OK, you know, that's really the core of what makes informal mingling work in the physical world. And we sort of thought about how we can recreate that in a hybrid setting, and so awareness is a key principle, right? So, enabling awareness for the participants, both the people physically present in the location as well as the people coming in remotely. I guess we have not really talked about what the setup is. The setup is very simple. We just have large screens with speaker, microphone, and camera that are installed, multiple of them, maybe, in the location of interest. And so, the remote people are on these screens, right? So whether you're remotely present on a screen or physically there, you have awareness of who else is around and who's approaching. You can overhear bits of other conversations. I mean, not that you can eavesdrop on people, but, just like in the physical world, you can overhear some conversation; you may not know exactly what's being said, but you know that, hey, there's Sridhar there, he's talking to someone, let me go and join him, right? The second principle is this principle of reciprocity. You know, again, with video conferencing tools, often in a business context or in a classroom lecture, it's perfectly fine to keep your video off and audio off, right, and just be listening, because there's a dominant speaker who's sort of running the show often.

    Sridhar Vedantham: Right.

    Venkat Padmanabhan: But if you're enabling chitchat and informal mingling, we believe that doesn't fly, right. If I'm going to chitchat with someone and I can't even see them, and I can't hear them, that's not going to go very far. So, we insist on this idea of, sort of, see and be seen, listen and be heard, right. So you know, that sort of two-way street. The third is the notion of agency, right? So, it's great to have awareness, but then you need to be able to do something with awareness, right? And what do you do in the physical world? You see someone interesting, or you overhear something interesting, you go there if you want to, and then if you lose interest, you come back, right? So, the ability to move between conversations in a very fluid and natural way, and we do that with a map interface in our system. And we can talk a bit more about the map, and how it is a little different from what other systems have done with maps, in a bit. But maybe, Ajay, if you want to add anything to this?

    Ajay Manchepalli: Yeah. The design principle is trying to ensure that the remote person has the same sort of experience they would have had if they had been there in person. And some of the aspects that Venkat just talked about, which is awareness, and the aspect of reciprocity, and the ability to change or move to different conversations, are very natural in a physical setting, and we are trying to mimic what you enjoy in a physical setup even for someone who is remotely trying to participate in that conversation. And so it's inspired by that.

    Sridhar Vedantham: Right, so Venkat just referred to this thing about a map, right? So how is it different from what already exists out there? I mean, what really, how do I put this? I mean, how does the map contribute to the experience of a user?

    Venkat Padmanabhan: So, I think the key point is that the map that we have in HyWay is a representation of the actual physical space where the in-person users, the physical users, are. This is in contrast to the many spatial conversation tools that came up, especially in the pandemic, that also have a map-based interface, but where the map is completely synthetic, right? So, by anchoring the map to the physical space, we can enable very interesting mingling scenarios across the physical and virtual divide. What I mean by that is, you know, I could be in a conversation, I could be physically present, and I could then walk over to a different location. The virtual user can see where I went, and they can follow along, just like they would be able to do if they were physically there, right. And the reverse can happen too: you know, as a physical user, I can say, hey, there's this remote person I was talking to, now they have moved to a different location. Well, on the map, I'll know where they've gone, and I can sort of follow, right. So that correspondence between the virtual view and the physical reality is key, right? That's what enables this kind of mingling across the physical and virtual divide.

    Ajay Manchepalli: To follow up on that, the fact that we are anchored to a physical space, I mean, the whole premise is that a remote person wants to participate in a particular location of interest, implies there has to be some context, right? And so, if I want to go and participate in conversations happening in my office, and specifically, let's say there is a book club that happens in a particular location, I can sort of contextualize myself. I mean, I have that spatial awareness of that space, right? So, if you give me a map-based experience, I can naturally navigate to the space that I'm interested in going to. So that sort of one-to-one mapping between the physical space and the map helps us stay oriented.

    Venkat Padmanabhan: I'll add one more thing to this. I just got reminded, as Ajay was talking. You know, this idea of creating a map that is anchored to the physical reality and tries to sort of portray that has actually worked out quite well. So, we've received feedback from users, but I think there was a user who's actually a visitor to our lab. He's a professor at the University of Oregon, or, I think, Oregon State University, who had visited us on HyWay remotely in 2022. And then when he came a year later, in 2023, in person, his first remark was, “Hey, this place looks so familiar. I've already been here.” It was his first time, right.

    Sridhar Vedantham: OK.

    Venkat Padmanabhan: So, the fact that he was able to navigate around our space using this map that corresponded pretty well to the physical reality gave him familiarity in a way that a synthetic map will simply not give you, right. So, I think that's really the point.

    Sridhar Vedantham: Yeah.

    [Music]

    Sridhar Vedantham: It sounds like, when you want to recreate this kind of a physical experience for someone who's there virtually, and you're saying that you'll have multiple spaces where people might congregate or have informal meetings or even formal meetings and so on, I think it logically follows that you're going to have multiple setups or instances of HyWay in various places, right? Now, two questions flow from that. One is, what is the kind of infrastructure that's required to enable a HyWay kind of scenario? And I think equally important is, how much does it actually cost? I mean, what kind of infrastructure is required for this?

    Ajay Manchepalli: That is an extremely important question to think through before we deploy, and the way we thought about this is that the world already has a lot of digital equipment, right? If you think in the context of an office, you have conference rooms which are set up with AV. And if you think of cafeterias, you have screens and things of that nature. And so we wanted to use off-the-shelf equipment to flight HyWay, right. And so that means all we need is a screen, so that people who are physically present can see the remote person participating. And we want an AV device, the camera and the microphone and the speakers, so that the remote person can hear people, and the in-person folks can hear the remote person. So, it's a very standard AV setup that you can think of, which is powered by a basic PC. This is a typical setup. The important thing to note here is that you need to place them in places where they don't create blind spots and where there's a natural area for mingling. Those are the things that we have to think through when we set up such equipment. And as we iterate through this, we are learning different ways to set things up, make it surprise-free, and make it more comfortable for people, so they don't feel like they're being, you know, watched over or something of that nature.

    Sridhar Vedantham: It's interesting you bring that up because my next question was going to be about privacy and letting people know that this thing is on and, you know, there might be someone virtually hanging out there, but how are people actually made aware of this?

    Ajay Manchepalli: Right. I mean, there are different ways we can approach it. One, when we set up the space, we provide sufficient visual cues to let people know that this is an area where your colleagues can participate remotely, right. In other words, they have a reason to be there, because they know that it's their own colleagues who are going to be participating. Two, we also made a very conscious decision that these experiences have to be ephemeral, right? We do not record, we do not store this information. And if you are in that moment, you get to hear and see people. If you weren't, you have missed it, just as it would have happened if you were in a physical location.

    Sridhar Vedantham: In real life.

    Ajay Manchepalli: In real life, right. So, these two are very important components. Now, having said that, you still have cameras out there, and you still have a microphone out there, and the fact that we have now made it advertised, so to speak, makes them even more nervous, right, even though people are living in a world with cameras everywhere, from a security standpoint. So, one of the things that we are continuing to work on is this notion of when we have these systems up and running, right? We don't just have them up and running all the time. You really need the system up and running only if there is a remote person joining in. And so, what we try to do is allow the physical users to have their natural conversations without a video feed of themselves on the screen, but at the same time provide visual cues to the remote user saying that there are users present in that location, so that that motivates them to get to that point. And once they are in that location, we start the conversation, or start the call, which enables the remote person to join in. But we also provide aural cues, right? We'll play an audio chime that gives an awareness to the people locally present, saying that, uh, there's a remote person joining in. This is a new experience. There will be reservations and there will be concerns, and I think, just as how we transitioned from, you know, having independent offices to an open-desk environment, this is sort of another form of change that we are going through, and so we have to be careful how we parse the feedback that we get and keep that in perspective.

    Venkat Padmanabhan: I'll also add just a few things to this, right. You know, there are technical design elements in HyWay that try to address this. Ajay touched on no recording, for example. We also, for example, don't do any user recognition, right. So, in our map we represent users with face bubbles. But that's just detecting a face. We don't know who it is. We don't try to find out who it is, right; we just detect the face and we put it on the map at the appropriate location, right? So when you glance at the map, you can see a face bubble of someone there. If you happen to know who that is, you'll recognize them. Otherwise, you'll see there's a person there, right? The other aspect is reciprocity, which I touched on earlier, and that also, I think, helps with privacy in the following sense: HyWay will not allow someone to hide themselves and eavesdrop on a conversation, because they will not be able to hear anything unless they are seen and heard. So, the fact that they are on the screen in a big video tile will make the people who are physically present aware that, hey, there's so-and-so on the screen. And so, you know, from a privacy viewpoint, it makes it more comfortable. The other thing I would say is the deployment context, right? So, Ajay sort of talked about the workspace setting, where you have colleagues who know each other and so on. We've also deployed in events which are confined in space and time, where people go to a trade show or a, you know, conference with the expectation that there'll be other people around who will see them and hear them and so on. And you know, things like privacy haven't really been, you know, front and center in those conversations, because it's really about...

    Sridhar Vedantham: They are already in the public space anyway.

    Venkat Padmanabhan: They're in a public space, there's no expectation that they are going to be, you know, shielded from the view of others, right. And so, I think the context matters. In a setting where this is running 24/7 and an interaction can happen at any time, yes, you have to worry about some of the things that Ajay talked about, but in other settings it's a little easier.

    Ajay Manchepalli: That's an extremely important point. It really depends upon the purpose that people are there for, right. In an event, you're going there to learn about something or to meet certain people, and you know in that context that there are many people that you're not aware of, and it's OK. So yeah, the context is super important.

    Sridhar Vedantham: So, what kind of events have you deployed in and what’s been the user reaction? Has it been all positive or mixed reactions? How’s it been?

    Ajay Manchepalli: One part of the ethos we have followed is that we didn't want to build a system and then think of where to deploy it. We have actually taken the opposite approach. That is, we pick a particular event, and when we realize that there is a need for remote participation, we go deploy the system based on the requirements that have been called out. And we built our system based on that. So, we have been deploying at events where people want to share information about the work that they have done. You know, things like, let's say your colleagues, all the researchers, are working on various topics, and we want to let people be aware of what the work is. In that context, we have set up things like poster sessions, where it is very natural to have booths and topical presentations happening. In this context, initially we decided that, OK, it makes sense to have the ability for a remote person to join in and participate in such events. But through the events, we realized that, once we set up such a system, there are remote presenters who would like to also participate. And so, we basically modified the system to support not just remote attendees, but to also enable remote presentations, where in-person attendees would actually interact with a remote presenter over the screen. So, these are the things that we have learned through the events. And the surprising result from this is, once we deployed at such events, people experienced it, and based on that experience we had the pull of having more events. We didn't have to go and advertise, so to speak. The very experience made people realize the value that we were bringing to the table. In terms of user feedback, it has been overwhelmingly positive. But at the same time, it's a research prototype. So, most of the feedback we have been receiving is more about the system, the audio quality and things like that, which is a systemic improvement that we can do over time. And we also have received feedback which is more in the context of, “Oh, I'm able to do this. Can you also add this?” right? You know, it's additional capabilities that they would like to see. And especially in the context of events, we haven't really seen any hesitation in terms of the privacy aspect, so to speak. Apart from such events, we have also tried to put it in informal mingling types of events where, OK, there is an end of year party and people are getting together and just talking about all the great moments they've had through the year. And people who are remote were able to also participate and mingle with people. So those are the other types of setups where we have deployed these things. All in all, I would say we have had deployments every couple of months, and with each deployment we learn something new and improve upon the system. Initially, we used to just have people's name initials showing up on the map. Then there was feedback that a visual cue of who the person is would be useful, because people can recognize that person rather than their initials, and things like that.

    Sridhar Vedantham: So, that's where the face bubble came in.

    Ajay Manchepalli: Face bubble came in, yeah.

    Venkat Padmanabhan: I'll also add one other use case. So far, we have one user for it, but a pretty important user, which is what I would call remote visits, right? So think of a busy exec who has a global organization and doesn't have the time to fly all the way to the other location to visit. You know, obviously you can do a remote visit on tools like Teams and Zoom and so on. But when our CEO Satya Nadella came in January of this year (2023), he was remote and he wanted to visit our lab. We decided to do it on HyWay, right. And that worked very well. Essentially, he got on HyWay and walked around the floor of the lab, you know, met with various people. And that gave him a sense of the place in a way that just presentations on a call would not have given, right. And his reaction was very positive, and that also gave some momentum to our efforts in finding more use cases for this.

    Ajay Manchepalli: Yeah, it got people excited. There is one other scenario that we have been exploring in the recent past, and that is in the context of the workspace. That is, you have an open office sort of structure, where you have a large table with 6 to 8 employees, our colleagues, sitting together and working, right. In this sort of setting, with the flexible come-into-work arrangement that we see these days, you invariably end up seeing only maybe 50-70% of the people showing up in person, and the rest of them working remotely on any given day. And at the end of the day, we are all about increasing productivity for each person, right, on this planet. So, from that context, if we are able to enable the remote employees who are sitting at home by themselves to feel part of the team, how do we bring them into the context of their workspace? So here it's not informal mingling, but it is more about being present along with your colleagues for a long duration of time, where you have serendipitous moments of discussion that happen, which you would have missed if you were remote. And that is another scenario that is turning out to be an important one for us to look at, because that need exists, and it requires a different way to enable the sort of interactions that we expect within the context of a team.

    Sridhar Vedantham: OK.

    Venkat Padmanabhan: It's really unscheduled interactions that can happen anytime with colleagues.

    Sridhar Vedantham: Right. Much like we keep doing in office actually.

    Ajay Manchepalli, Venkat Padmanabhan: Yes, exactly right.

    Sridhar Vedantham: OK, so where do you see this going in the future? I mean, do you have a road map in mind? Do you have, say, a longer-term plan?

    Venkat Padmanabhan: I think we are very much in the phase where we are, you know, augmenting the system as we go along, right. As Ajay said, we've been in this habit of deploying and learning through deployment. So we've done a bunch of deployments, we've got feedback, we've had ideas. And so, at this stage we are just looking to, you know, make the system better. We've deployed at a bunch of internal events. One day we might actually take it to external events, external conferences, academic conferences, and so on, and hopefully have a larger audience experience this. And then we'll see where it takes us, right. The feedback, as Ajay said, has been very positive, and what's been particularly satisfying is that it is very clear from the feedback, both from users and from the event organizers, that there's an unmet need here. There is something that people want to do, but existing tools are not doing it, which is why they're coming to us. And as I said, we are a small team, we have not advertised ourselves, but people are coming and asking for it, right, internally at Microsoft so far. So that tells us there is an unmet need here. So, we think if we do the right things and make the system even better than it is, good things will happen.

    Ajay Manchepalli: And the fact that we are able to deploy and there is more demand for it, in spite of the fact that you have microphones and cameras everywhere, tells us that the value that we are bringing outweighs the risks that people perceive, right. And that's a very strong case for the unmet need that Venkat was referring to.

    Sridhar Vedantham: People want to meet people.

    Ajay Manchepalli: Yeah.

    Venkat Padmanabhan: (Laughs) Well said.

    Sridhar Vedantham: All right. So we are kind of coming to the end of this. Are there any last thoughts you want to leave the listeners with?

    Venkat Padmanabhan: Maybe I'll just go back to an earlier point- this is a more specific point. I think when we were discussing hardware requirements and, you know, deployment models and so on, one of the things we decided early on is to of course keep things simple- low-cost commodity hardware and so on. But related to that is this idea that we don't want to impose on the people who are physically present in any way, right. So they just show up and talk and mingle just the way they've always done. The only difference is there is an additional screen, maybe in their vicinity, that allows remote people to join. And, you know, the unexpected benefit of that really has been that our system allows bootstrapping for free. The physical people vote with their feet; they are there regardless of HyWay. They would have been there anyway in the hallway, chatting with each other. But because HyWay is really about instrumenting the environment with these screens and not really imposing on the users, they're automatically, you know, represented in the system. And so now there's a strong incentive for me as a remote user to get on HyWay. And it's not even an app. You just go to the web page, you get on HyWay, because there are other people to interact with. This is in contrast to an approach where you need everyone to do something like install an app or get some hardware or, you know, have a wearable computer or whatever to participate. Because then you have the network effects sort of biting you the wrong way, which is, you know, you start with a user base of zero and then you have to really get to a critical mass. We're getting critical mass for free. And we didn't really think of it that way on day one, but that realization has come as we have deployed and learnt from that.

    Sridhar Vedantham: OK.

    Ajay Manchepalli: From HyWay, one thing we have learned is that, as they say, necessity is the mother of invention. This is a great example of that, because it's not that we didn't have remote people before, and it's not that we didn't have the technology to support something like this. But we had this Black Swan moment with COVID, which required us to not be in the same physical location at all times, and that accelerated the adoption of digital technologies. That also accelerated the possibility of something like this. It gave us that flexibility, and that led to a point where it almost became a no-brainer to have such a system. And that has been a key aspect: you can build all the technology you want, but having it at the right time and right place matters the most.

    Sridhar Vedantham: And you know, after seeing deployments of HyWay in the lab and so on, I think one thing that came home to me was the fact that you don't have to invest in extremely expensive infrastructure and fancy VR headsets. You don't need augmented reality, virtual reality and all that to actually create a kind of hybrid workspace.

    Venkat Padmanabhan: Yeah, there's a spectrum here, and we are at the point where we are focusing on low cost so that we can scale. Obviously, you can do other things with more fancy hardware, but they may be harder to scale. And it's sort of the philosophy of a lot of our work. If you remember the HAMS project we talked about previously: again, just a smartphone. Rather than instrumenting the vehicle, you just put in a smartphone, and it observes how you're driving and gives you safety feedback, right? So it's an approach that we think works well in many cases.

    Ajay Manchepalli: And when you build something that you want to use every day, it's super exciting to be in such a project. I think we have a bunch of motivated individuals making this successful.

    Venkat Padmanabhan: Before we close, I do want to give a huge shout out to a lot of people, but in particular to the HyWay team. Right from day one, they've really been the backbone of the project. A bunch of young people, you know, researchers, Research Fellows, RSDEs, that is, research software development engineers, all pulled together to do this. It's a pretty small team that's done amazing things, so I really want to call them out.

    Sridhar Vedantham: Great! Thank you for yet another fascinating conversation, Venkat and Ajay.

    Venkat Padmanabhan: Thanks, Sridhar. It was a real pleasure again.

    Ajay Manchepalli: Pleasure is all ours. Thank you.

    [Music ends]


  • Microsoft Research India Podcast – Podcast (5)
    HAMS - Using Smartphones to Make Roads Safer. With Dr. Venkat Padmanabhan and Dr. Akshay Nambi. 13 Jun 2022 · Microsoft Research India Podcast

    Episode 013 | June 14, 2022

    Road safety is a very serious public health issue across the world. Estimates put the traffic related death toll at approximately 1.35 million fatalities every year, and the World Health Organization ranks road injuries in the top 10 leading causes of death globally. This raises the question- can we do anything to improve road safety? In this podcast, I speak to Venkat Padmanabhan, Deputy Managing Director of Microsoft Research India and Akshay Nambi, Principal Researcher at MSR India. Venkat and Akshay talk about a research project called Harnessing Automobiles for Safety, or HAMS. The project seeks to use low-cost sensing devices to construct a virtual harness for vehicles that can help monitor the state of the driver and how the vehicle is being driven in the context of the road environment it is in. We talk about the motivation behind HAMS, its evolution, its deployment in the real world and the impact it is already having, as well as their future plans.

    Venkat Padmanabhan is Deputy Managing Director at Microsoft Research India in Bengaluru. He was previously with Microsoft Research Redmond, USA for nearly 9 years. Venkat’s research interests are broadly in networked and mobile computing systems, and his work over the years has led to highly-cited papers and paper awards, technology transfers within Microsoft, and also industry impact. He has received several awards and recognitions, including the Shanti Swarup Bhatnagar Prize in 2016, four test-of-time paper awards from ACM SIGMOBILE, ACM SIGMM, and ACM SenSys, and several best paper awards. He was also among those recognized with the SIGCOMM Networking Systems Award 2020, for contributions to the ns family of network simulators. Venkat holds a B.Tech. from IIT Delhi (from where he received the Distinguished Alumnus award in 2018) and an M.S. and a Ph.D. from UC Berkeley, all in Computer Science, and has been elected a Fellow of the INAE, the IEEE, and the ACM. He is an adjunct professor at the Indian Institute of Science and was previously an affiliate faculty member at the University of Washington. He can be reached online at http://research.microsoft.com/~padmanab/.

    Akshay Nambi is a Principal Researcher at Microsoft Research India. His research interests lie at the intersection of Systems and Technology for Emerging Markets, broadly in the areas of AI, IoT, and Edge Computing. He is particularly interested in building affordable, reliable, and scalable IoT devices to address various societal challenges. His recent projects are focused on improving data quality in low-cost IoT sensors and enhancing performance of DNNs on resource-constrained edge devices. Previously, he spent two years at Microsoft Research as a post-doctoral scholar, and he completed his PhD at the Delft University of Technology (TUDelft) in the Netherlands.

    More information on the HAMS project is here: HAMS: Harnessing AutoMobiles for Safety - Microsoft Research

    For more information about Microsoft Research India, click here.

    Related

    Microsoft Research India Podcast: More podcasts from MSR India | iTunes: Subscribe and listen to new podcasts on iTunes | Android | RSS Feed | Spotify | Google Podcasts | Email

    Transcript

    Venkat Padmanabhan: There are hundreds of thousands of deaths and many more injuries happening in the country every year because of road accidents. And of course it's a global problem, and the global problem is even bigger. The state of license testing is such that, by some estimates in public reports, over 50% of licenses are issued without a test or a proper test. So we believe a system like HAMS that improves the integrity of the testing process has huge potential to make a positive difference.

    [Music]

    Sridhar Vedantham: Welcome to the Microsoft Research India podcast, where we explore cutting-edge research that’s impacting technology and society. I’m your host, Sridhar Vedantham.

    [Music]

    Sridhar Vedantham: Venkat and Akshay welcome to the podcast. I think this is going to be quite an interesting one.

    Venkat Padmanabhan: Hello Sridhar, nice to be here.

    Akshay Nambi: Yeah, Hello Sridhar, nice to be here.

    Sridhar Vedantham: And Akshay is of course officially a veteran of the podcast now since it's your second time.

    Akshay Nambi: Yes, but the first time in person so looking forward to it.

    Sridhar Vedantham: Yes, in fact I am looking forward to this too. It's great to do these things in person instead of sitting virtually and not being able to connect physically at all.

    Akshay Nambi: Definitely.

    Sridhar Vedantham: Cool, so we're going to be talking about a project that Venkat and you are working on, and this is something called HAMS. To start with, can you tell us what HAMS means or what it stands for, and a very brief introduction into the project itself?

    Venkat Padmanabhan: Sure, I can take a crack at it. HAMS stands for Harnessing Automobiles for Safety. In a nutshell, it's a system that uses a smartphone to monitor a driver and their driving, with a view to improving safety. So we look at things like the state of the driver, where they're looking, whether they're distracted, and so on. That’s sort of looking at the driver. But we also look at the driving environment, because we think, to truly attack the problem of safety, you need to have both the internal context inside the vehicle as well as the external context. So that's the sort of brief description of what HAMS tries to do.

    Sridhar Vedantham: Ok. So, you spoke about a couple of things here, right? One is the safety aspect of, you know, driving, both internal and external. When you're talking about this, can you be more specific? And especially, how did this kind of consideration feed into, say, the motivation or the inspiration behind HAMS?

    Akshay Nambi: Yeah, so as you know, road safety is a major concern, not just in India but globally, right? And when you look at the factors affecting road safety, there is the vehicle, there's the infrastructure, and the driver. And the majority of the incidents today center on the driver. For instance, the key factors affecting road safety include over-speeding, driving without seatbelts, drowsy driving, drunken driving. All centering around the driver. And that motivated us to look at the driver more carefully, which is why we built the system HAMS, which focuses on monitoring the driver and also how he's driving.

    Sridhar Vedantham: And India in particular has an extremely high rate of deaths per year, right, in terms of road accidents.

    Akshay Nambi: Yes, it's at the top of the list. In fact, around 80,000 to 1.5 lakh people die every year according to the stats from the government. It's an alarming number, and hopefully we are taking baby steps to improve that.

    Venkat Padmanabhan: In fact, if I may add to that, if you look at the causes of death overall, not just road accidents but diseases and so on, road accidents are in the top 10. And if you look at the younger population, people under 35 or 40, it's perhaps in the top two or three. So it is a public health issue as well.

    Sridhar Vedantham: And that's scary. Ok, so how does this project actually work? I mean, tell us a little bit about the technology you guys developed and the research that's gone into it.

    Venkat Padmanabhan: Sure, yeah, let me actually wind back maybe 10-15 years to when we first started on this journey, and then talk more specifically about HAMS and what's happened more recently. Smartphones, as you know, have been around for maybe 15 years, a bit longer maybe. And when smartphones started emerging in the mid-2000s and late 2000s, we got quite interested in the possibility of using a smartphone as a sensor for, you know, road monitoring, driving monitoring and so on. And we built a system here at Microsoft Research India back in 2007-08 called Nericell, where we used a leading-edge smartphone of that era to do sensing. But it turned out that the hardware then was quite limited in its capabilities in terms of sensors; even an accelerometer was not there, we had to pair an external accelerometer and so on. And so our ability to scale that system and really have interesting things come out of it was quite limited. Fast forward about 10 years: not only did smartphone hardware get much better, the AI and machine learning models that could process this information became much better, and among the new sensors in the newer smartphones are the cameras, the front camera and the back camera. And machine learning models for computer vision have made tremendous progress. So that combination allowed us to do far more interesting things than we were able to back then. Maybe Akshay can talk a bit more about the specific AI models and so on that we built.

    Akshay Nambi: Yeah, so if you compare the systems in the past to HAMS, what was missing was the context. Systems in the past, like the Nericell system Venkat mentioned, were collecting the sensor data, but they were lacking context. For example, a system could tell whether the driver did a harsh braking or not, but it could not tell whether he did it because somebody jumped in front of the vehicle, or because he was distracted. The cameras that new smartphones have can provide this context, which makes these systems much more capable and able to provide valuable insights. And in terms of the specific technology itself, we go with commodity smartphones, which have multiple cameras today: the front camera looking at the driver, the back camera looking at the road. And we have built numerous AI models to track the driver state, which includes driver fatigue and driver gaze, that is, where the driver is actually looking. And also, with the back camera, we look at how the driver is driving with respect to the environment. That is, is he over-speeding, is he driving on the wrong side of the road, and so on.

    Sridhar Vedantham: So, this is all happening in real time.

    Akshay Nambi: The system can support both real-time and offline processing. As you know, smartphones are intelligent edge devices, but they still have limited processing power. So, we decide which capabilities should run in real time, which can be offloaded to the cloud, and which can be left for offline processing.

    Sridhar Vedantham: OK.

    Venkat Padmanabhan: I want to make a distinction between our ability to run things in real time- as Akshay said, much of our research was in making the computation inexpensive enough that it can run in real time- and the user interface. So, we explicitly decided early on in our journey that we did not want a system that intervened in real time and, you know, provided alerts, because the bar for that, if you will, is very high. In the sense that you don't want to make a mistake: if you alert a driver and that alert is actually a mistake, you might actually cause an accident, right? And since we were shooting for a low-cost system with just a smartphone and so on, it did not seem like a reasonable thing to aim for. What we really aim for is processing that is efficient, that doesn't overheat the phone- you know, the processing can sometimes just cause a smartphone to melt down. But at the same time, depending on the context, and we'll get to our driver testing application soon hopefully, we can offload computation to either a more capable edge device nearby or to the cloud, as Akshay said, and we definitely want to leverage that. We're not bound to just the smartphone for compute.

    Sridhar Vedantham: Right, so you know you spoke about the fact that you're using commodity hardware to do all this, right? And it's always fascinated me that today's consumer device, basically the commodity hardware that you get, is capable of doing a lot of stuff. But even then, there must have been challenges in taking, you know, maybe a 20,000 or 25,000 rupee phone just off the market and trying to get it to do things that, frankly, sound like they should be running in the cloud and not on the phone. Did you face any of these challenges?

    Akshay Nambi: Oh, numerous. To start with, on low-cost smartphones, as you turn on the cameras, as most of you would have noticed, the phone heats up much more quickly than in normal use. While the system setup itself is very simple, just a smartphone, building a system on top of a smartphone poses numerous challenges. Starting with real-world conditions: there are different lighting conditions as you drive on roads, daytime, nighttime. How does your algorithm adapt to these conditions? There are different types of vehicles, hatchback, SUV. How does your algorithm adapt to these? Different driver seating positions, different ways of mounting the smartphone in the vehicle. All of these can change, right? So getting your AI models to work in such a dynamic setup is the biggest challenge, and one of the key pieces of research in HAMS is to address these challenges in a practical way. That's been one of our key focuses. Second, coming to the hardware itself: since we want to do some of this processing on the smartphone itself, you have to come up with algorithms that are efficient. Which means: today's smartphone cameras can generate 30 frames per second, but the compute power is not there to process all 30 frames. So you have to come up with intelligent algorithms to decide which frame you want to process, which frame you want to discard, and which frame you want to apply a cheaper algorithm to rather than a better one. So there are a lot of these decisions you have to go through to build this system.
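As a concrete illustration of the frame-triage idea Akshay describes, here is a minimal sketch: a heavy model runs on a small fraction of frames, a cheaper model refines some of the rest, and the remainder are dropped. The budget numbers and function names are assumptions, not taken from HAMS.

```python
# Hypothetical per-frame triage on a compute-constrained phone; the models
# and the budget constant are placeholders, not HAMS's actual pipeline.

HEAVY_EVERY = 10  # run the full model ~3 times/sec at 30 fps (assumed budget)

def process_stream(frames, heavy_model, light_model):
    last_result = None
    for i, frame in enumerate(frames):
        if i % HEAVY_EVERY == 0:
            last_result = heavy_model(frame)                # accurate but slow
        elif i % 2 == 0:
            last_result = light_model(frame, last_result)   # cheap refinement
        # remaining frames are simply discarded to stay within budget
        yield last_result
```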

    Sridhar Vedantham: Right.

    Venkat Padmanabhan: Just to add to what Akshay said, if I step back, there are, I would say, two major pillars of our research. One is being adaptive. For example, Akshay talked about the processing being expensive. Let's say you're trying to do vehicle ranging: we're trying to figure out whether the driver is tailgating, being too close to the vehicle in front. There are very good machine learning models that will do object detection, you know, find the vehicles in front of you in a frame and so on, but they're expensive. Instead, we could combine that approach with a much less expensive tracking algorithm, so that once you've found an object, you just use the cheaper tracking algorithm for a while, because the vehicle in front of you is not going to disappear. If it's in front of you now, chances are that for the next several seconds it'll be there. So this is adaptivity. The other aspect is auto-calibration, or calibration in general. As Akshay said, the vehicle geometry, the mounting of the phone, the driver seating position, all of that changes, and it is obviously not practical to recalibrate each time a driver mounts a phone and starts. So we need ways of automatically calibrating the system. I would say a lot of our technical work and the research we've published falls in these two buckets.
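The detect-occasionally, track-in-between pattern Venkat describes can be sketched with off-the-shelf pieces. This is a hedged illustration, not the HAMS code: detect_vehicle() stands in for any object detector, and OpenCV tracker construction differs across versions (newer builds expose it as cv2.legacy.TrackerKCF_create, via opencv-contrib-python).

```python
import cv2

# Sketch of detector/tracker interleaving: run an expensive detector only
# occasionally, and a cheap visual tracker in between, since the vehicle
# ahead doesn't vanish from frame to frame. detect_vehicle() is assumed
# to return a bounding box (x, y, w, h) or None.

DETECT_EVERY = 15  # re-run the detector every 15 frames (assumed)

def follow_vehicle(video_path, detect_vehicle):
    cap = cv2.VideoCapture(video_path)
    tracker, i = None, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if tracker is None or i % DETECT_EVERY == 0:
            box = detect_vehicle(frame)         # expensive, occasional
            if box is not None:
                tracker = cv2.TrackerKCF_create()
                tracker.init(frame, box)
        else:
            ok, box = tracker.update(frame)     # cheap, runs every frame
            if not ok:
                tracker = None                  # lost target; re-detect next
        i += 1
    cap.release()
```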

    [Music]

    Sridhar Vedantham: One thing that one of you mentioned, I don't remember exactly who, was being able to detect if a driver is, say, drowsy. How does that actually work? Because it sounds a little science-fictiony to me, right? How is a phone, sitting there mounted on the dashboard or the windshield of a car, magically telling you whether the driver is sleepy or drowsy, or whether the driver is doing the right thing while driving?

    Venkat Padmanabhan: You're right. I mean, knowing the true internal state of a driver is not easy, but there are outward manifestations of how tired or sleepy someone is. Obvious ones are your eyes sort of drooping, and also yawning and things like that. And so those are things that we can pick out by just having a camera looking at the driver's face. We're not claiming that this is 100% perfect in terms of knowing the driver's state, but these are good indicators. In fact, we have some interesting data from our work, and maybe Akshay can talk about this long drive he went on, where these outward signs actually correlated with his own perception.

    Akshay Nambi: Yeah, that's true. There are several studies which have looked at eye blinking and yawning patterns to assess the state of the driver, and we developed detectors for these. In fact, we were going on an interstate trip where we deployed HAMS in our own cab. It was early morning, so the driver was just awake and fresh; he was driving well, and our algorithms were also detecting that he was alert. Then we stopped for breakfast, and afterwards the system started beeping, detecting that the eye blinking and yawning rates were much higher. We were in the cab and had not noticed it, but the system was able to detect it. He was an experienced driver, so the driving still seemed fine, but he was drowsy.
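For readers curious how eye-blink monitoring can work from a single camera: one common approach in the research literature, and not necessarily the exact method HAMS uses, is the eye aspect ratio (EAR) computed over six landmarks around each eye, which drops sharply when the eye closes. A sustained rise in the fraction of eyes-closed frames is then a drowsiness signal. The threshold below is an assumed, tunable value.

```python
import numpy as np

def eye_aspect_ratio(eye):
    # eye: 6x2 array of landmarks p1..p6 around one eye; the ratio of the
    # vertical openings to the horizontal width falls when the eye closes.
    p1, p2, p3, p4, p5, p6 = eye
    vertical = np.linalg.norm(p2 - p6) + np.linalg.norm(p3 - p5)
    horizontal = np.linalg.norm(p1 - p4)
    return vertical / (2.0 * horizontal)

EAR_CLOSED = 0.2  # assumed threshold; tuned per camera and driver position

def closure_fraction(ear_series):
    # fraction of recent frames with eyes closed (a PERCLOS-style measure)
    ears = np.asarray(ear_series)
    return float((ears < EAR_CLOSED).mean())
```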

    Venkat Padmanabhan: As a safety-conscious person, Akshay should have stopped the cab right then, but he was part of the research project. He wanted the data, so he kept going, yeah.

    Sridhar Vedantham: He is truly committed to the science. Cool, have there been other projects to do with traffic and road safety and so on anywhere in the world? And how does what you guys are doing differ from those things?

    Venkat Padmanabhan: Yeah, so let me start and maybe Akshay can add to it. It's a very rich space, right. Around the same time we started Nericell in the mid-2000s, a bunch of other people started working on this: startups, as well as university groups like at MIT and Princeton and so on. IIT Bombay had an active project that was looking at similar things. And then, as I said, HAMS we did about 10 years later. The biggest distinction I would say is what Akshay touched on in the beginning, which is that compared to a lot of these existing systems that people use, including what insurance companies now use, at least in the Western world, we have these camera sensors that allow us to get the context of a driver. Akshay gave an example of the driver being distracted. I'll give you an even simpler example. Think of something like speeding. You would imagine speed is very easy to get: you have a GPS device, pretty accurate, that will give you speed. But when you talk about speeding and the safety issues related to it, it is not just your vehicle's speed.

    Sridhar Vedantham: It's contextual.

    Venkat Padmanabhan: It is how you're doing relative to other vehicles. If the speed limit is 80 kilometers per hour and others are going at 90 or 100, it is safer to go at 90 or 100 than at 80, right?

    Sridhar Vedantham: Yeah.

    Venkat Padmanabhan: So that context is something that you get only with camera-based sensing, and that, for the most part, is what the other work is not looking at. I would say we are perhaps among the earliest to look at that as part of the mix for this problem.
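Venkat's point about contextual speed can be shown with a toy check that compares the vehicle's speed to the surrounding traffic flow rather than only to the posted limit. The rule, the numbers, and the threshold are purely illustrative, not from HAMS.

```python
from statistics import median

def speed_risk(own_kmph, nearby_kmph, limit_kmph, max_gap=15):
    # Flag risk by deviation from the traffic flow, not just the limit;
    # max_gap and the rule itself are illustrative assumptions.
    flow = median(nearby_kmph) if nearby_kmph else limit_kmph
    if abs(own_kmph - flow) > max_gap:
        return "risky: far from the traffic flow"
    return "ok"

# Driving at the 80 km/h limit while traffic flows at ~98 km/h gets flagged:
print(speed_risk(80, [95, 100, 98], 80))
```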

    Sridhar Vedantham: Ok. And I know you guys have been working on this project for about three years, four years now?

    Venkat Padmanabhan: Oh, longer. We actually started in 2016.

    Sridhar Vedantham: 2016.

    Venkat Padmanabhan: So it's a bit longer, yeah.

    Sridhar Vedantham: Right. And I know that before we had this unfortunate pandemic, there used to be all these weird posts and signs in the basement of the office here, which I was told Akshay put up. I was told I'm not supposed to run over those signs or move them, because they were for a very important research project that Akshay was running, called HAMS. What were you guys doing there in the basement?

    Akshay Nambi: Right. While we have spoken about various detectors for understanding the driver's state and how he's driving, one of the key things we have developed looks at how the driver drives. Specifically, we look at the trajectory of the driving itself. Today you can use GPS to get the trajectory of how the vehicle is being driven, but the accuracy of these GPS devices is in meters. If you are trying to understand how the driver is performing a reverse parking or a parallel parking maneuver, you want to understand the trajectory at centimeter level: how many forward moves he took, how many reverses he took. And to do that, we have come up with a way of using visual cues, basically the features in the environment plus some of those markers you have seen in the basement, which provide very accurate localization, down to a few centimeters. And that's how we are able to get the accurate trajectory. Now this trajectory can be used for various applications. One, as I said, is seeing how the driver is parking; it could also be used to see how the driver is driving in an open environment, and this was mainly to see how new drivers are actually driving the vehicle.

    Sridhar Vedantham: OK.

    Venkat Padmanabhan: You can think of these markers, they're called fiducial markers, as the beacons of a lighthouse. They give you reference points, right? So you can locate yourself accurately, down to centimeter level, using these as reference points.
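The transcript doesn't say which marker system HAMS uses; ArUco fiducials in OpenCV are a common choice for this kind of lighthouse-style localization, so here is a hedged sketch. Given the camera calibration and the printed marker size, each detected marker yields the camera's pose relative to that fixed, surveyed marker. This requires opencv-contrib-python, and the aruco API names differ across OpenCV versions.

```python
import cv2

# Hedged sketch: locate the camera against surveyed fiducial markers.
# API shown is the pre-4.7 cv2.aruco interface from opencv-contrib-python.

def locate_against_markers(frame, camera_matrix, dist_coeffs, marker_len_m=0.15):
    aruco = cv2.aruco
    dictionary = aruco.getPredefinedDictionary(aruco.DICT_4X4_50)
    corners, ids, _ = aruco.detectMarkers(frame, dictionary)
    if ids is None:
        return None
    # Pose of each marker in the camera frame; since the markers are fixed
    # and surveyed, this gives the vehicle's position to a few centimeters.
    rvecs, tvecs, _ = aruco.estimatePoseSingleMarkers(
        corners, marker_len_m, camera_matrix, dist_coeffs)
    return list(zip(ids.flatten().tolist(), rvecs, tvecs))
```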

    Sridhar Vedantham: Ok. Now I also know that you've implemented HAMS at scale at various places. Uh, can we talk a bit about that, where it's been implemented and what it's being used for?

    Venkat Padmanabhan: That's a good question, Sridhar. Let me provide the broader context here, and then maybe Akshay can chime in with the details. As I said, we started HAMS in 2016, and in the years that followed, we looked at many different potential applications, for example, fleet monitoring, driver training and so on. Now, as luck would have it, we got connected with a very interesting organization called the Institute for Driving and Traffic Research, which is run by Maruti Suzuki, India's largest car manufacturer. They are very much interested in issues of road safety, and being Maruti, such a big player in the vehicle business, they're very well connected with the government. So, in late 2018 we went with them to the Ministry of Road Transport and Highways and met with a senior bureaucrat, and what was supposed to be a 15-minute meeting went on for over two hours, because the person we met with really liked the potential of HAMS, in particular in the context of driver testing. As you know, before you're granted a license to drive, you're supposed to be tested. But the reality in India is that because of the scale of the country and its population, anecdotally, and as even some studies have shown, many licenses are issued without a proper test or without a test at all. Which obviously means untested and potentially unsafe drivers are on the road, and they're contributing to the scale of the problem. Now, the government is alive to this problem, and they're quite interested in technology for automation and so on. But the technology that people were using was quite expensive and therefore difficult to scale. So from this conversation we had in 2018, what emerged is that our simple smartphone-based solution could be used as a very low-cost but at the same time high-coverage solution. When I say high coverage, I mean it can actually monitor many more parameters of a driver's expertise in driving than the existing high-cost solutions. So that really got us started, and maybe Akshay can talk about where that journey led us.

    Akshay Nambi: This meeting with the ministry in Delhi that Venkat mentioned led us to talking to the government of Uttarakhand, who were looking to set up a greenfield automated license testing track in Dehradun. This was the first implementation of HAMS for automated license testing. And remember, this was a research project. Taking this research project to an actual deployment, with a third-party organization and the government involved, was a major task. It's not just about technology transfer; it's about making the people on the ground understand the technology and run with it every day. We spent several months with the government officials translating the research project into an actual real-world deployment, which also included translating the parameters in their driver license testing guidelines into things that can be measured and monitored in the real world. For instance, this would mean: is the driver wearing a seat belt, and how much time did it take to complete a particular maneuver? All of these things need to be monitored by the system. This translation was one of the biggest challenges in taking a research project to a real-world deployment. The deployment went live in July 2019, with the entire test completely automated. Automated here means that there is no inspector sitting in the vehicle. The inspector comes to the vehicle, deploys the smartphone and then exits. You as a candidate drive within a confined area which has multiple maneuvers, and the smartphone monitors how you are driving against a set of defined parameters, based on which you are marked. At the end of the test, a report is automatically generated which says which maneuvers you passed and which maneuvers you failed, along with video evidence of why you failed, and it is provided to the candidate. And the final result is uploaded to the central government for issuing the license.
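The scoring flow Akshay describes, guideline parameters checked per maneuver and assembled into a report with video evidence, might look roughly like the sketch below. Every name here is hypothetical; the actual HAMS parameters and report format are not specified in this transcript.

```python
from dataclasses import dataclass, field

@dataclass
class ManeuverResult:
    name: str
    passed: bool
    reasons: list = field(default_factory=list)        # e.g. "seat belt off"
    evidence_clips: list = field(default_factory=list)

def score_test(maneuvers, checks):
    # checks: {maneuver_name: [(description, predicate_over_telemetry), ...]}
    # Each maneuver object is assumed to carry .name, .telemetry, .video_clip.
    report = []
    for m in maneuvers:
        failures = [desc for desc, ok in checks[m.name] if not ok(m.telemetry)]
        report.append(ManeuverResult(
            name=m.name,
            passed=not failures,
            reasons=failures,
            evidence_clips=[m.video_clip] if failures else []))
    return report
```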

    Sridhar Vedantham: This is very interesting. What has been the reaction of people to this? I'm sure that when people suddenly saw that they're going to be evaluated by a smartphone, it must have thrown them for a loop, at least in the initial stages.

    Akshay Nambi: Much to our surprise, it was completely the opposite. They very much preferred the smartphone to an inspector in the vehicle. (Laughs)

    Venkat Padmanabhan: I think people trust a machine more than a person, because they feel a person can perhaps be biased and so on, whereas the machine they just trust. In fact, some of the comments we got said, you know, look, I failed the test, but I like the system.

    Sridhar Vedantham: So people seem to be happy to take the human error part out of the equation.

    Venkat Padmanabhan: The subjectivity, right?

    Sridhar Vedantham: Yeah, the subjectivity, yeah.

    Venkat Padmanabhan: They feel the system is objective. The other thing I should mention, which obviously we didn't plan for and didn't anticipate: after COVID happened, this idea that you take the test without anyone else in the vehicle gained a new significance, because things like physical distancing became the norm. You could take a test with just a smartphone and not have to worry about sitting next to an inspector, or the inspector worrying about sitting next to a driver.

    Venkat Padmanabhan: And that was an unexpected benefit of our approach.

    Sridhar Vedantham: Right? That's very interesting. I never thought of this, although I've been tracking this project for a while.

    Venkat Padmanabhan: Neither did we. It just happened, and then we realized, in retrospect, that it was a good idea.

    Sridhar Vedantham: Yeah. And what's the road map? Where does the project go now?

    Venkat Padmanabhan: Yep. Akshay talked about the Dehradun deployment that happened in 2019. That really caught the attention of several state governments. In fact, they sent their people to Dehradun to see how the system was working, and they came back quite impressed. So there were deployments in Bihar, deployments happening in Andhra Pradesh, and some sites in Haryana, and several other states are in discussion to deploy the system. So at this point we have four RTOs that are live, with a couple more that are almost live- they're pretty much ready to go- and about a dozen more in the works in various stages. But of course there are a thousand RTOs in the country. So there's still a long way to go, and one of the challenges is that this has to proceed on a state-by-state basis, because it is...

    Sridhar Vedantham: A state subject.

    Venkat Padmanabhan: It's a state subject, exactly. But we are working with external partners who we have enabled with the HAMS technology.

    Sridhar Vedantham: Venkat, it sounds like this project has some serious potential for large societal impact.

    Venkat Padmanabhan: That's indeed the case, Sridhar. In fact, we think there's huge potential here for beneficial impact, and that's what has really been driving us. Just to give you context for the numbers: we already talked about the scale of the problem- there are hundreds of thousands of deaths and many more injuries happening in the country every year because of road accidents. And of course, it's a global problem, and the global problem is even bigger. The state of license testing is such that, by some estimates in public reports, over 50% of licenses are issued without a test or a proper test. So we believe a system like HAMS that improves the integrity of the testing process has huge potential to make a positive difference. Now, where we are in the journey today is that we have done about 28,000 automated license tests using HAMS across all the states where it's been deployed. But an estimated 10 million or more licenses are issued in the country every year. So we think that by scaling to the 1000-plus RTOs that I talked about earlier, we can potentially touch a majority, or perhaps even all, of these license tests, and thereby have much safer roads in the country, with drivers who are well tested and really ready to drive being the only ones on the road.

    Sridhar Vedantham: Fantastic. Now we are coming towards the end of another podcast. Are there any thoughts that you'd like to leave the listeners with before we wind up?

    Venkat Padmanabhan: Sure, I can share something, and then maybe Akshay can add to it as well. I would say, stepping back from the specific use case of HAMS and road safety and so on, what this experience has taught us is this: if you take technology and mate it with a societally significant problem, in this case road safety, really understand the problem, and work with partners like we did (we worked with the Maruti IDTR, we worked with other external partners, and we talked to multiple governments at the center and the states), and bring the problem and the technology together in a meaningful way, we can make a huge difference. And that's really quite inspirational for us, because it tells us that there's a lot of good we can do as technologists and researchers.

    Akshay Nambi: Not much to add; Venkat summed it up nicely. The one minor point I would add is that we don't have to look for problems elsewhere. Problems are right next to us, and picking up these societal-impact problems has a lot of value.

    Sridhar Vedantham: OK, fantastic, thank you both for your time.

    Venkat Padmanabhan: Thanks Sridhar. It’s been a pleasure.

    Akshay Nambi: Thanks Sridhar, this was very good.

  • Microsoft Research India Podcast – Podcast (6)
    A Random Walk From Complexity Theory to Machine Learning. With Dr. Neeraj Kayal and Dr. Ravishankar Krishnaswamy. 30 May 2022 · Microsoft Research India Podcast

    Episode 012 | May 30, 2022

    Neeraj Kayal: It's just a matter of time before we figure out how computers can themselves learn like humans do. Just look at human babies: they have an amazing ability to learn by observing things around them. And currently, despite all the progress, computers don't have that much ability. But I just think it's a matter of time before we figure that out, some sort of general artificial intelligence.

    Sridhar Vedantham: Welcome to the MSR India podcast. In this podcast, Ravishankar Krishnaswamy, a researcher at the MSR India lab, speaks to Neeraj Kayal. Neeraj is also a researcher at MSR India and works on problems related to, or at the intersection of, computational complexity and algebra, number theory and geometry. He has received multiple recognitions through his career, including the Distinguished Alumnus award from IIT Kanpur, the Gödel Prize and the Fulkerson Prize. Neeraj received the Young Scientist Award from the Indian National Science Academy (INSA) in 2012 and the Infosys Prize in Mathematical Sciences in 2021. Ravi talks to Neeraj about how he became interested in this area of computer science and his journey till now.

    For more information about Microsoft Research India, click here.

    Related

    Microsoft Research India Podcast: More podcasts from MSR India | iTunes: Subscribe and listen to new podcasts on iTunes | Android | RSS Feed | Spotify | Google Podcasts | Email

    Transcript

    Ravi Krishnaswamy: Hi Neeraj, how are you doing? It's great to see you after two years of working from home.

    Neeraj Kayal: Hi Ravi, yeah, thank you. Thank you for having me here, and it's great to be back with all the colleagues in office.

    Ravi Krishnaswamy: First of all, congratulations on the Infosys Prize; it's an amazing achievement. And it's a great privilege for all of us to have you as a colleague here. So, congratulations on that.

    Neeraj Kayal: Thank you.

    Ravi Krishnaswamy: Yeah, so maybe we can get started on the podcast. So, you work in complexity theory, which is, I guess, at the very theoretical end of the spectrum in computer science, almost bordering mathematics. So hopefully by the end of this podcast we can convince the audience that there's more to it than intellectual curiosity. Before that, let me ask you about how you got into theoretical computer science and the kind of problems that you work on. Could you maybe tell us a bit about your background and how you got interested in this subject?

    Neeraj Kayal: Yeah, so in high school I was doing well in maths in general, and I also wrote some computer programs to play board games, like a generalized version of Tic-Tac-Toe where you have a bigger board, say 20 by 20, and you try to place five things contiguously in a row, column, or diagonal. And then I started thinking about how a computer could learn to play, or improve itself, in such a game. So, I tried some things and didn't get very far with that, but at that time I was pretty convinced that one day computers would be able to really learn like humans do. I didn't see how that would happen, but I was sure of it, and I just wanted to be in computer science to eventually work on such things. But in college, in the second year of my undergrad, I enrolled for a course in cryptography taught by Manindra Agrawal at IIT Kanpur. The course started off with some fairly predictable initial things, something called symmetric key cryptosystems, which essentially address the following: let's say we two want to have a private conversation, but anyone else can listen to us. So how do we have a private conversation? Well, if we knew a secret language which no one else did, then we could easily just converse in that language, and no one would understand us. And this is made a little more formal in symmetric key cryptosystems. And then, one day, Manindra ended one of the lectures with the following problem: now suppose we did not know a secret language. We just know English, and everyone knows English, so how do we talk privately when everyone can hear us? I thought about it for a few days. It seemed completely impossible. And then Manindra told us about these wonderful cryptosystems, the Diffie-Hellman cryptosystem and the RSA cryptosystem, where they achieve exactly this, and it was very surprising. And the key thing that these cryptosystems use is something that lies at the heart of computer science, still a big mystery even to this day. There are these problems which we believe are hard for computers to solve in the following sense: even if we give a computer a fairly long, reasonable amount of time, it cannot solve them, but if we give it time till the end of the universe, it can in principle solve such problems. So that got me interested in which problems are hard, and whether we can prove they are actually hard or not. And to this day, we don't know that.
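The "secret language" Neeraj starts from has a textbook formalization, the one-time pad: if two people share a random key no one else knows, XOR-ing with it is their private language. The sketch below is that standard construction, not anything specific to the course he describes; the puzzle he poses next, talking privately with no shared secret, is what Diffie-Hellman and RSA solve.

```python
import os

def xor_bytes(data: bytes, key: bytes) -> bytes:
    # One-time pad: XOR with a random key of the same length.
    return bytes(d ^ k for d, k in zip(data, key))

message = b"meet at noon"
key = os.urandom(len(message))            # the shared secret "language"
ciphertext = xor_bytes(message, key)      # gibberish to any eavesdropper
assert xor_bytes(ciphertext, key) == message  # the same key decrypts
```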

    Ravi Krishnaswamy: So, I'm guessing that you're talking about the factoring problem, right?

    Neeraj Kayal: Yes, factoring is one of the big ones here. And the RSA cryptosystem uses factoring.

    Ravi Krishnaswamy: So, it's actually very interesting, right? You started out by trying to show that some of these problems are very, very hard. But looking back, your first research paper, which happens to be a breakthrough work in itself, is in showing that a certain problem is actually easier to solve than we had originally thought. Right, so this is the seminal work showing that primality testing can be solved in deterministic polynomial time. I mean, that's an amazing feat, and you had worked on this paper with your collaborators as an undergrad, right?

    Neeraj Kayal: Yes.

    Ravi Krishnaswamy: Yeah, that's an incredible achievement. So maybe to motivate others who are in undergrad and who have an interest and inclination in such topics, could you maybe share us some story on how you got working in that problem and what sort of led you to this spark that eventually got you to this breakthrough result?

    Neeraj Kayal: So, my advisor Manindra, who was also the professor who taught us cryptography, had been working on this problem for a long time. There were already algorithms that existed which are very good in practice, very, very fast in practice, but they had this small chance that they might give the wrong answer. The chance was so small that practically it did not matter, but still it remained a mathematical challenge whether we could remove that small chance of error, and that's what the problem was about. So, Manindra had an approach, and he had worked on it with other students also, some of our seniors, and in that course he came up with a conjecture. When my colleague Nitin and I joined this project, we came across this conjecture, and my first reaction was that the conjecture was false. So, I tried to write a program which would find a counterexample, and I thought we would be done in a few days: just find that counterexample and the project would be over. So, I wrote a program; it ran for some time and didn't find a counterexample, so I decided to parallelize it. A huge number of machines in the computer center at IIT Kanpur started looking for that counterexample. And then, to my surprise, we still couldn't find the counterexample. So there seemed to be something to it, something happening there which we didn't understand, and in trying to prove that conjecture, we managed to prove a weaker statement which sufficed for obtaining the polynomial-time algorithm to test if a number is prime or not. But it was not the original conjecture itself. Many days after this result came out, we met a mathematician called Hendrik Lenstra who had worked on primality testing, and we told him about this conjecture. And after a few days he got back to us and showed that, if you assume a certain number-theoretic conjecture which we really, really believe is true, a counterexample to our original conjecture does exist.
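The pre-existing tests Neeraj alludes to, very fast in practice but with a tiny chance of error, are randomized tests such as Miller-Rabin, sketched below for context. This is emphatically not the deterministic AKS algorithm that came out of the work he is describing; it only illustrates the kind of small, controllable error the AKS result removed.

```python
import random

def is_probable_prime(n, rounds=40):
    # Miller-Rabin: each round has error probability at most 1/4 on a
    # composite input, so 40 independent rounds make an error
    # astronomically unlikely, yet never quite impossible.
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):
        if n % p == 0:
            return n == p
    r, d = 0, n - 1
    while d % 2 == 0:            # write n - 1 = 2^r * d with d odd
        r, d = r + 1, d // 2
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False         # witness found: n is certainly composite
    return True                  # no witness found: n is almost surely prime
```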

    Ravi Krishnaswamy: Ok, I see. So, the original conjecture, which you had hoped to prove true, is false, but the weaker statement was actually true, you proved it to be true, and that was enough for your eventual application.

    Neeraj Kayal: Yes, so in some sense we were very lucky that in trying to prove something false, we managed to prove something useful.

    Ravi Krishnaswamy: Yeah, I mean, it's a fascinating story, right? All the experiments that you ran pointed you towards proving it, and then you actually went and proved it. I imagine what would have happened if you had found a counterexample at that time, right?

    Neeraj Kayal: So yeah, Hendrik's proof was very interesting. He showed that, modulo this number theory conjecture, a counterexample existed. But it would have to be very, very large, and that's why we couldn't find it. So, he explained it beautifully.

    Ravi Krishnaswamy: Yeah, thanks for that story, Neeraj. So, I guess from then on you've been working in complexity theory, right?

    Neeraj Kayal: That's right, yeah.

    Ravi Krishnaswamy: So, for me at least, the Holy Grail in complexity theory that I've often encountered or seen is the P versus NP problem, which many of us might know. But you've been working on an equally important and very close cousin of that problem, called the VP versus VNP problem, right? So, I'm going to take a stab at explaining what I understand of the problem; correct me whenever I'm wrong. You are interested in trying to understand the complexity of expressing polynomials using small circuits. So, for example, if you have a polynomial of the form x^2 + y^2 + 2xy, you could represent it as a small circuit which has a few addition operations and a few multiplication operations, that is, you could express it as x^2 + y^2 + 2xy itself. Or you could express it as (x + y)^2, which may have a smaller representation in terms of a circuit. So, you have been working on trying to identify which polynomials have small representations, and which polynomials are natural but still don't have small representations.

    Neeraj Kayal: That's right.

    Ravi Krishnaswamy: Is that a reasonable approximation of the problem you're thinking about?

    Neeraj Kayal: Yes, that's right. So, another way to put the same thing is: what is the power of computation when you do additions, multiplications, subtractions, all these arithmetic operations? You could include division and square roots also.
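Ravi's example can be made concrete with a few lines of sympy: the same polynomial written as a three-term sum and as a single square, with sympy confirming they agree and even recovering the smaller form. Circuit complexity asks how far such compression can go, and for which polynomials it provably cannot.

```python
import sympy as sp

x, y = sp.symbols("x y")
expanded = x**2 + y**2 + 2*x*y   # several multiplications and additions
factored = (x + y)**2            # one addition, then one squaring

assert sp.expand(factored - expanded) == 0   # identical polynomials
print(sp.count_ops(expanded), sp.count_ops(factored))  # rough op counts
print(sp.factor(expanded))       # recovers the smaller circuit: (x + y)**2
```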

    Ravi Krishnaswamy: So, I have seen this VP class, and it makes a lot of sense to me. It's the set of all polynomials that can be captured by small-sized circuits with addition and multiplication gates. I've also seen the VNP class, which seems to me at least to be a bit mysterious, right? These are all the polynomials whose coefficients of the individual monomials can be computed efficiently. Is that a reasonable definition? Is my understanding correct?

    Neeraj Kayal: Yeah, that's the technical definition of this class, but there's another natural sort of intuition why we want to look at it, and the intuition is that it relates to counting the number of solutions to a problem, and also therefore to computing probabilities of various things happening.
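
    For readers who want the formal version, the following is a sketch of Valiant's textbook definition, which ties the coefficient view to the counting intuition mentioned here; it comes from the literature rather than from the conversation itself.

    ```latex
    % Valiant's definition: a polynomial family (f_n) is in VNP if there is a
    % family (g_n) in VP and a polynomially bounded m(n) such that
    f_n(x_1,\dots,x_n) \;=\; \sum_{w \in \{0,1\}^{m(n)}} g_n(x_1,\dots,x_n,\, w_1,\dots,w_{m(n)})
    % i.e., f_n is an exponential sum, over boolean "witness" strings w, of an
    % efficiently computable polynomial, which is why VNP is tied to counting
    % solutions and to computing probabilities.
    ```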

    Ravi Krishnaswamy: I see. Ok, so that gives me a lot more understanding. I guess when you're able to estimate probabilities, you could also do sampling over those objects.

    Neeraj Kayal: Yes exactly.

    Ravi Krishnaswamy: Yeah, that's a very nice connection. I did not know about this. Thanks for that. So, you have an agenda on trying to show some sort of a separation between the two classes, VP and VNP, by constructing these low-depth circuits, right? You're able to show that all polynomials in VP admit a low-depth representation, and your hope in this agenda is to find one polynomial in VNP which does not have a low-depth representation, right?

    Neeraj Kayal: That's right.

    Ravi Krishnaswamy: So, how far are you in this agenda and do you think we have all the tools needed to actually achieve success through this sort of a method?

    Neeraj Kayal: Yeah, so just historically, converting a circuit or a program into a low-depth program- this was done earlier, and most of that work was done by other people, so we haven't contributed much in that direction. We have been trying to prove that certain polynomials don't have small-depth, small-sized arithmetic circuits. It's not clear to us whether the existing techniques are good enough to prove this or not. And like on Mondays, Wednesdays, and Fridays, I think they are capable maybe, and on the other days I think maybe not. And this is what researchers generally deal with, especially in these areas where you don't know whether your tools are good enough or not. And very recently, just last year, there was a big breakthrough by a trio of complexity theorists who showed somewhat good lower bounds for all constant-depth arithmetic formulas or circuits. And what was surprising about this result is that they used, in a very clever way, techniques that were already known.

    Ravi Krishnaswamy: So, they would have probably shown it on a Monday or Wednesday or Friday.

    Neeraj Kayal: Yes, yes. [Laughs]

    Ravi Krishnaswamy: OK, that's very interesting. So, you still don't know whether this will lead to success or not through this route.

    Neeraj Kayal: Yes, yeah, we still don't know that.

    Ravi Krishnaswamy: Are there other people approaching this problem through other techniques?

    Neeraj Kayal: So, there's a program called the Geometric Complexity Theory program, initiated independently by other people, which basically tries to understand and exploit the symmetries implicit in this question. And there's a field of mathematics, group theory and representation theory, which is all about understanding the symmetries of objects. That area is beautiful, really beautiful, and a lot of advancement has been made there. So, people have been trying to attack this problem using those tools as well.

    Ravi Krishnaswamy: Yeah, that's very nice, I think. So basically, you're saying a lot of like diverse techniques from math and computer science are at play here and trying to help you on your progress.

    Neeraj Kayal: That's right.

    Ravi Krishnaswamy: I see. I find it fascinating and beautiful that a lot of these diverse techniques from mathematics and computer science come into play in establishing these lower bounds. And what's more fascinating to me is that they are not interesting just from an intellectual curiosity standpoint. They seem to be powering a lot of things that we take for granted, right from, as you said, messaging each other through social networks or whatever it is. The inherent hardness of certain problems seems to be at the foundation of a lot of things that we take for granted.

    Neeraj Kayal: Yeah, that's right, Ravi. So, for example, I do transactions using my mobile phone and anyone who is within a reasonable distance of my mobile phone can read all the signals that my phone is sending. So, they can see all the communication that I'm having with the bank. And the fact that despite that they are not able to infer my banking passwords relies on the fact that certain problems are very inherently hard to solve and that's what we are trying to prove.

    Ravi Krishnaswamy: OK, so that's very interesting Neeraj. And in the last part of this podcast, I want to flip the topic around a little bit. So, you've been working a lot on showing lower bounds, and in lower bounds in arithmetic complexity. But lately in the last couple of years you have also been using those insights into showing some very nice algorithms for some learning problems. I find that also very cool, so maybe you can talk a little bit about that.

    Neeraj Kayal: Yeah, so the algorithms that we are trying to devise solve the following problem- a more general version of it is the following. Given a function or a polynomial, what's the smallest number of operations that you need to do to compute that function or polynomial? For Boolean functions this has a very long history. That essentially is like designing chips, and you can imagine it was naturally very useful to think about. But more recently, a lot of work has found another very surprising connection, because of which this problem, specifically for polynomials, has also become very interesting. And the connection is this. Suppose you have some very big data set. For now, think of this data set as consisting of a bunch of points in high dimensional space. For example, you can think of every image as a point in a high dimensional space. Now it turns out that you can take statistics of this data. So, for example, you can ask: what's the average value of the first coordinate, what's the average value of the second coordinate, or what's the average value of the product of the first two coordinates in this data set, and so on. So, you can take some of these statistics and encode them as the coefficients of a polynomial. And here's the interesting part. When the data has some very nice structure, then this polynomial tends to have a small circuit.
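
    A rough sketch of the kind of statistics Neeraj describes- averages of coordinates and of products of coordinates (first and second moments)- which, in method-of-moments style learning, are read as the coefficients of a polynomial. All names and numbers here are illustrative.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    data = rng.normal(size=(10_000, 5))   # 10,000 points in 5-dimensional space

    first_moments = data.mean(axis=0)     # E[x_i]: coefficients of degree-1 terms
    second_moments = (data[:, :, None] * data[:, None, :]).mean(axis=0)
    # second_moments[i, j] = E[x_i * x_j]: coefficients of degree-2 terms

    # If the data has nice structure (say, a mixture of a few simple
    # distributions), the polynomial built from these coefficients tends to
    # have a small circuit, which is what makes circuit-finding algorithms
    # useful for unsupervised learning.
    ```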

    Ravi Krishnaswamy: I see.

    Neeraj Kayal: And so, when you want to understand the structure of data, so this general area is called unsupervised learning. Turns out that it's useful to find small circuits for polynomials. So, this is the computational problem that we are looking at: given a polynomial, what's the smallest number of operations, or what's the smallest circuit representing this polynomial.

    Ravi Krishnaswamy: So, if you're able to find the smallest circuit representing this, then from that you will be able to infer the underlying distribution or the structure of the underlying data.

    Neeraj Kayal: Yes, yes, that's right. So, this is one connection. And it also turns out that the lower bounds we are proving, showing that certain things are very hard to compute, are also useful for devising algorithms to find the circuits of polynomials which do have small circuits. Maybe let me give you a very rough sense of how that comes about, because I find this a bit fascinating. Here's how the lower bound proofs work. Underlying all those lower bounds for the various subclasses of circuits that we do have is a collection of linear maps. And it turns out that when you are given a polynomial which has a small circuit, then using this polynomial and the collection of linear maps which go into the lower bound proof, you can form another big linear map such that, very roughly, the eigenspaces of this new linear map correspond to the smallest circuit for the polynomial.

    Ravi Krishnaswamy: I see.

    Neeraj Kayal: And this was the connection that we discovered some time ago, which helped us find small circuits.

    Ravi Krishnaswamy: So, you find small circuits by computing the eigenspaces of that map.

    Neeraj Kayal: Yes, of this other linear map. That's right Ravi.

    Ravi Krishnaswamy: I see, that's very nice. Ok, so I think we covered a lot of the topics that I wanted to cover, so maybe I'll end with two philosophical questions. One is: you began the podcast by talking about how, as a kid, you thought computers or machines would be able to do everything that human intelligence can do. It's a vague question, but what's your take on that now? And two: what advice would you give to budding theoreticians, whether they're in school or college or grad school?

    Neeraj Kayal: So, for the first question, Ravi- and I know a lot of other people also share this feeling- I think it's just a matter of time before we figure out how computers can themselves learn like humans do. Human babies have an amazing ability to learn by observing things around them, and currently, despite all the progress, computers don't have that much ability. But I just think it's a matter of time before we figure that out- some sort of general artificial intelligence. To your second question, Ravi, I don't have much to offer other than perhaps a banal comment: anyone looking to work in this area should really enjoy thinking about these kinds of problems. They tend to be rather abstract, and sometimes the applications are not apparent, but if you enjoy thinking about them, I'm sure you'll do well.

    Ravi Krishnaswamy: That's great, Neeraj. It's been a pleasure chatting with you. Thanks a lot for your time and hope you had fun.

    Neeraj Kayal: Yeah, thanks Ravi. Thanks a lot for having me.

  • Collaborating to Develop a Low-cost Keratoconus Diagnostic Solution. With Dr. Kaushik Murali and Dr. Mohit Jain · 17 Jan 2022 · Microsoft Research India Podcast

    Episode 011 | January 18, 2022

    Keratoconus is a severe eye disease that affects the cornea, causing it to become weak and develop a conical bulge. Keratoconus, if undiagnosed and untreated, can lead to partial or complete blindness in the people it affects. However, the equipment needed to diagnose keratoconus is expensive and non-portable, which makes early detection of keratoconus inaccessible to large populations in low- and middle-income countries, and makes it a leading cause of partial or complete blindness among such populations. Doctors from Sankara Eye Hospital, Bengaluru and researchers from Microsoft Research India have been working together to develop SmartKC, a low-cost and portable diagnostic system that can enable early detection and mitigation of keratoconus. Join us as we speak to Dr. Kaushik Murali from Sankara Eye Hospital and Dr. Mohit Jain from Microsoft Research India.

    Dr. Kaushik Murali is President, Medical Administration, Quality & Education at Sankara Eye Foundation India (Sri Kanchi Kamakoti Medical Trust), which is among the largest structured community eye hospital networks in India (www.sankaraeye.com), with an objective of providing world-class eye care with social impact.

    A paediatric ophthalmologist, Dr. Kaushik has completed a General Management Programme and is an alumnus of INSEAD. He has done a course on Strategic Management of Non-Profits at the Harvard Business School. He has been certified in infection control, risk management for health care, and digital disruption. He is a member of Scalabl, a global community promoting entrepreneurship.

    Dr. Kaushik is a member of the Scientific Committee of Vision 2020, the Right to Sight India. He is currently involved in collaborative research projects among others with the University of Bonn & Microsoft.

    Dr. Kaushik has received many recognitions, key among them the Bernadotte Foundation for Children's Eyecare Travel Grant, the Mother Teresa Social Leadership Scholarship, International Eye Health Hero, the All India Ophthalmological Society best research award, the International Association for Prevention of Blindness (IAPB) Eye Health Hero, and the Indian Journal of Ophthalmology Certificate of Merit.

    Beyond the medical world, he is part of the National Management Team of Young Indians – Confederation of Indian Industry (CII). He represented India at G20 Young Entrepreneur Alliance 2018 at Argentina and led the Indian delegation for the Inaugural India- Israel Young Leaders Forum in 2019. More recently, he led the first citizen’s cohort for a workshop on Strategic Leadership at LBSNAA (Lal Bahadur Shastri National Academy of Administration).

    Mohit Jain is a Senior Researcher in the Technology and Empowerment (TEM) group at Microsoft Research India. His research interests lie at the intersection of Human Computer Interaction and Artificial Intelligence. Currently, he focuses on developing end-to-end systems providing low-cost smartphone-based patient diagnostic solutions for critical diseases. Over the past decade, he has worked on technological solutions for the developing regions focusing on health, accessibility, education, sustainability, and agriculture.

    He received his Ph.D. in Computer Science & Engineering from the University of Washington, focusing on extending interactivity, accessibility and security of conversational systems. While pursuing his Ph.D., he also worked as a Senior Research Engineer in the Cognitive IoT team at IBM Research India. Prior to that, he graduated with a Masters in Computer Science from the University of Toronto, and a Bachelors in Information and Communication Technology from DA-IICT.

    For more information about the SmartKC project, and for project related code, click here.

    For more information about Microsoft Research India, click here.


    Transcript

    Dr. Kaushik Murali: Sitting in an eye hospital, often we have ideas, but we have no clue whom to ask. But honestly, now we know that there is a team at MSR that we can reach out to saying that hey, here is a problem, we think this warrants attention. Do you think you guys can solve it? And we found that works really well. So, this kind of a collaboration is, I think, a phenomenal impact that this project has brought together, and we hope that together we will be able to come up with few more solutions that can align with our founders’ dream of eliminating needless blindness from India.

    [Music]

    Sridhar Vedantham: Welcome to the Microsoft Research India podcast, where we explore cutting-edge research that’s impacting technology and society. I’m your host, Sridhar Vedantham.

    [Music]

    Sridhar Vedantham: Keratoconus is a severe eye disease that affects the cornea, causing it to become weak and develop a conical bulge. Keratoconus, if undiagnosed and untreated, can lead to partial or complete blindness in the people it affects. However, the equipment needed to diagnose keratoconus is expensive and non-portable, which makes early detection of keratoconus inaccessible to large populations in low- and middle-income countries, and makes it a leading cause of partial or complete blindness among such populations. Doctors from Sankara Eye Hospital, Bengaluru and researchers from Microsoft Research India have been working together to develop SmartKC, a low-cost and portable diagnostic system that can enable early detection and mitigation of keratoconus. Join us as we speak to Dr. Kaushik Murali from Sankara Eye Hospital and Dr. Mohit Jain from Microsoft Research India.

    [Music]

    Sridhar Vedantham: So, Dr. Kaushik and Mohit, welcome to the podcast.

    Mohit Jain: Hi, Sridhar.

    Dr. Kaushik Murali: Hi Sridhar, pleasure to be here.

    Sridhar Vedantham: It's our pleasure to host you, Dr. Kaushik. And for me this is going to be a really interesting podcast for a couple of reasons. One is that the topic itself is so far afield from what I normally hear at Microsoft Research, and the second is that I think you're the first guest on the podcast who's actually not part of MSR- a collaborator, in other words. So, this is really exciting for me. Let me jump right into this. We're going to be talking about something called keratoconus, so could you educate us a little as to what keratoconus actually is and what its impact is?

    Dr. Kaushik Murali: So, imagine that you were a 14-year-old who was essentially near-sighted. You wore glasses and you were able to see. But with passing time, your vision became more distorted rather than just blurred, which is what you would have expected if your minus power simply kept increasing, especially for distance. And to add to your misery, you started seeing more glare and more halos at nighttime. Words that you read had shadows around them or even started to look doubled. This essentially is the world of a person with keratoconus. Literally, it means cone-shaped. Keratoconus is a condition of the cornea, which is the transparent front part of the eye, similar to your watch glass. Instead of retaining its normal dome shape, the cornea undergoes progressive thinning and weakening of its central part, what we call the stroma, and this makes the cornea take on a conical shape. In a few people, this can progress even beyond what I describe: the central cornea over time becomes scarred, and the person can no longer be corrected with just optical devices like glasses or a contact lens, but may actually end up requiring a corneal transplant.

    Sridhar Vedantham: I see, and what are the causes for this?

    Dr. Kaushik Murali: So, very many causes have been attributed to it, and it's thought to be multifactorial. This again makes it a little tricky, in that we are not able to prevent the condition, so to speak. But multiple risk factors are known: ultraviolet exposure and chronic allergies, and habitual eye rubbing is thought to make people more prone to it. Essentially, you end up seeing it more in the pubertal age group, and more in men.

    Sridhar Vedantham: I see. And how widespread is this problem, really? Because frankly, I'm of course as lay a person as you can get, and I hadn't really heard of eye disease called keratoconus until I spoke to Mohit at some point and then of course after reading papers and so on. But what is the extent of the issue and is it really that widespread a problem?

    Dr. Kaushik Murali: So, unlike most other conditions, there is no real population-based survey where we have screened every household to arrive at numbers. Largely, we base our estimates on small surveys that have been done across different parts of the world. Based on these, we estimate that it affects approximately one in 2,000 individuals. In the US, for example, about 55 people in every 100,000 are thought to have been diagnosed with keratoconus. But in countries like India, it is thought to be more widespread. There was actually a survey in central India where they found almost 2,300 people out of every 100,000 affected by keratoconus. So, the numbers are quite large. And all of this could be underestimated, simply because we don't have enough ability to screen. What makes this number even scarier is that this is a disease that typically affects people between the ages of 10 and 25. So, once they're affected and their vision progressively comes down, they're going to spend most of their productive years not being able to see clearly.

    Sridhar Vedantham: OK, that is kind of scary.

    Mohit Jain: I would just like to add that there is actually a combination of demographics, genetics and weather conditions which makes India a really good host for this disease. Apparently, the Indian population tends to have a thinner and steeper cornea to begin with, and moreover the hot and humid climate of India actually contributes towards keratoconus, because it causes irritation, which leads to frequent rubbing of the eyes, and that can actually start the process of distortion of the cornea.

    Sridhar Vedantham: OK, so that just made the problem sound a little scarier because there are these conditions that cannot be altered, right? I mean climate and stuff like that we can't really change. Uh, OK, doctor, so, this is basically a well-established and recognized disease, right? So, what are the barriers and hurdles actually to effective diagnosis and treatment of this disease?

    Dr. Kaushik Murali: So, in any health intervention the barriers typically are availability, accessibility and affordability. And when you look at a condition like keratoconus, all these barriers come into play. The clinical gold standard for diagnosing keratoconus essentially entails mapping the curvature of the corneal surface using a technique known as corneal topography. Here we either use a Placido-based device, where you project a disc-like pattern of rings onto the cornea and capture an image with a topographer, or we use certain other technologies to map out the surface of the cornea, both the anterior and the posterior surface. But all these devices are, by and large, non-portable and expensive, and they are typically available only at secondary or tertiary eye hospitals. India is a land of diversity. We have some of the best health care in cities like Bangalore, where we are sitting and doing this recording, but move about 150 to 200 kilometers away and it's a completely different world out there. The cost of each of these tests again makes this diagnosis very inaccessible to large sections of the population, not only in India but in similar middle- and low-income countries. And to add to this, you have the bogey of the pandemic. With COVID-19 in play for the last two years, with restrictions on travel, it has become very difficult for young children to undergo the annual eye exams through which we could have proactively picked up some of these conditions.

    Sridhar Vedantham: OK, alright, Mohit, let me bring you in here. I'm curious as to what Microsoft Research is doing as part of this particular research collaboration, right? Because keratoconus sounds very, very far afield from Microsoft Research’s typical computer science focus.

    Mohit Jain: So, I think one of the key pillars of MSR India is societal impact. If we can develop a diagnostic tool which can save even a handful of children from going blind, we think it has huge societal impact, and Microsoft Research India completely supports such technological solutions. With respect to the computer science research side, we have developed a solution which uses cutting-edge AI, especially in the field of image processing, and we have also built a full end-to-end system. So there are enough computer science research problems, really hard ones, that we actually solved as part of this research collaboration.

    Sridhar Vedantham: OK, so that's a good lead into me to ask you to talk a little bit about the solution that's been developed here.

    Mohit Jain: So, I think the solution has three core components. The first component is a Placido disc- as Dr. Kaushik said, there has to be something which projects concentric rings onto the cornea. So, we actually 3D printed a Placido disc, attached it to the back of a smartphone camera, and we then capture an image of the cornea with the Placido rings projected over it. The second component is a smartphone app with inbuilt AI. In real time, it helps the person who is capturing the image to get a perfect image, because one of the core design principles we had while working on this was to make sure that anyone can use this app to diagnose- we didn't want only medical technicians to be able to use it. So, there is AI assistance to help capture a perfect image of the eye. And the third component is an image processing pipeline which takes this captured image as input and converts it into corneal topography heat maps. Basically, it gives a curvature heat map, which tells you the curvature at each point of the cornea, and that is the kind of output you also get from a medical-grade corneal topographer.
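
    As a toy sketch of the kind of first step such an image-processing pipeline might take- locating the projected Placido rings in the captured photo- here is an OpenCV snippet. This is not the actual SmartKC pipeline (which is open source and linked above); the filename, thresholds and circularity cutoff are all made up for illustration.

    ```python
    import math
    import cv2

    img = cv2.imread("eye_with_placido.jpg", cv2.IMREAD_GRAYSCALE)
    blur = cv2.GaussianBlur(img, (5, 5), 0)     # suppress sensor noise
    edges = cv2.Canny(blur, 50, 150)            # ring boundaries show up as edges
    contours, _ = cv2.findContours(edges, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)

    # Keep roughly circular contours; distortions of these rings encode the
    # corneal curvature from which a heat map could be computed.
    rings = []
    for c in contours:
        area = cv2.contourArea(c)
        perim = cv2.arcLength(c, True)
        if area > 100 and perim > 0:
            circularity = 4 * math.pi * area / (perim * perim)  # 1.0 for a circle
            if circularity > 0.7:
                rings.append(c)

    print(f"detected {len(rings)} ring-like contours")
    ```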

    Sridhar Vedantham: So, the way you explain it sounds very simple, but I'm sure there were a bunch of challenges, while you know, both Sankara and MSR was doing the research. Can you give me a sense of what kind of challenges you faced?

    Mohit Jain: Yes, yes, I think the trickiest part of this project was to do it in a portable setting. The corneal topographer, the $10,000 device which is there in Sankara hospital, has a headrest and a chin rest, so the patient's face is very, very stable, and hence the image capture process is fairly easy. Apart from that, the topographer has a lot of specific hardware. For example, it has specific hardware to determine how far the eye is from the camera, which is called the working distance, and getting that parameter right is very crucial: predicting that value wrong by even a few millimeters can completely change the generated heat map. So, we had to design a very innovative solution to figure out this working distance directly from the captured image. And the second part was that we did a lot of iteration on the Placido disc, and also on the AI assistance running on the smartphone to help capture the best image, even without any kind of support system in place- without any headrest or chin rest.

    Sridhar Vedantham: Dr. Kaushik was there anything you wanted to add to that in terms of challenges from a medical point of view?

    Dr. Kaushik Murali: So, from the medical point of view, we are people of habit. By that I mean there are certain things we are comfortable with and certain things that put us off. So, one of the key challenges we gave the MSR team was that the interface and reports had to be similar to what we were used to seeing. The challenge for the team was to ensure that the heat maps were similar to the heat maps we were used to seeing from a regular topographer, and to see how well they matched. The minute we got that kind of familiarity built in, we found our doctors accepting it much better. And once that was done, the learning curve in terms of using the device or interpreting the images came down very, very fast. So, we were able to adapt to this much faster. We were even able to get some of our colleagues from other units to read the heat maps we generated, just to validate them. Because we were also concerned that when you put something out as a screening device, you could end up overestimating the disease. There is an indirect cost to a person; imagine the psychological impact on a parent if you tell them their young child may have a possible problem. We didn't want to do that, so we were very finicky about the validation. It went through multiple iterations, almost to the effect that Mohit and his team could have lunch on a Thursday only after they finished a call with us.

    (Laughs)

    Mohit Jain: To add to that, this is actually a very crucial point. There are these two values called sensitivity and specificity. Just to give you some context: sensitivity is, if the person has keratoconus, are we able to diagnose it as keratoconus. Specificity is, if the person does not have keratoconus, are we diagnosing that correctly? Initially, we were thinking that we need really high sensitivity- we cannot leave anyone who has keratoconus undiagnosed- but that we could live with lower specificity: even if someone who does not have keratoconus is told they might have it, it's still OK, because he or she will go to a doctor, get checked a second time, and then it will be found that he or she does not have keratoconus. But Dr. Kaushik was really against that. He actually made our problem statement harder, saying that we want both sensitivity and specificity to be above 92%, because he did not want parents unnecessarily worrying that their kid, who is in his teens, has a very serious eye condition. So we had to, as Dr. Kaushik said, go through multiple iterations to even get the specificity right. And right now, I think we are at a stage where both numbers are above 94% for a small trial that we did with Sankara, with more than 100 patients, and in future we plan to extend that trial to many more patients.
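
    Mohit's two definitions translate directly into code; here is a minimal sketch (illustrative, not from the SmartKC codebase) computing both from screening outcomes, with 1 meaning keratoconus and 0 meaning healthy.

    ```python
    def sensitivity_specificity(predicted, actual):
        """predicted/actual: lists of 1 (keratoconus) and 0 (healthy)."""
        tp = sum(p == 1 and a == 1 for p, a in zip(predicted, actual))
        fn = sum(p == 0 and a == 1 for p, a in zip(predicted, actual))
        tn = sum(p == 0 and a == 0 for p, a in zip(predicted, actual))
        fp = sum(p == 1 and a == 0 for p, a in zip(predicted, actual))
        sensitivity = tp / (tp + fn)  # of the eyes that truly have it, how many we caught
        specificity = tn / (tn + fp)  # of the healthy eyes, how many we correctly cleared
        return sensitivity, specificity

    # e.g. a hypothetical screening run over six eyes:
    print(sensitivity_specificity([1, 1, 0, 0, 1, 0], [1, 1, 0, 0, 0, 0]))  # (1.0, 0.75)
    ```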

    [Music]

    Sridhar Vedantham: A question for both you, Mohit, as well as Doctor Kaushik, so, whoever wants to answer it. You know, you just put out some numbers there in terms of percentages, right, to say that this solution works well, but typically these need to be tagged against some kind of industry benchmark or some kind of medical benchmark given the established machines and equipment already that are already being used, right? So how does this SmartKC stack up against those?

    Dr. Kaushik Murali: So, once the MSR team had come up with a complete solution and had tested its reliability in their lab, so to speak, with the images they had with them, we applied to our Ethics Committee for a prospective study. We enrolled subjects coming to our hospital, and we imaged their corneas with SmartKC as well as with the Placido-based topography system we have in our hospital, which we would typically have used in any case. One index that the device we use in the hospital relies on to identify keratoconus is called the percentage probability of keratoconus, or the PPK value. If this is more than 45%, it is supposed to indicate the presence of keratoconus. What we found was that with SmartKC, the sensitivity of the PPK value was 94.1% and the specificity was 100%, which was pretty comparable to our other device, which stood at about 94.1% sensitivity and 95.5% specificity. More importantly, what matters whenever you use a device as a screening tool is how repeatable it is, and what the inter-rater agreement is: if two different people use the same device on the same person, do they actually read it the same way? Again, we found those indices to be really, really high. There's something called the Cohen's kappa value; this was about 0.94 for our tool, compared to 0.96 for the device we have in the hospital. This essentially indicates a high level of agreement between two experts diagnosing keratoconus based on the images we generate using SmartKC.
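
    For readers unfamiliar with the Cohen's kappa statistic Dr. Kaushik quotes, here is a small sketch: it measures agreement between two raters, corrected for the agreement you would expect by chance. The example labels are made up.

    ```python
    def cohens_kappa(rater1, rater2):
        """rater1/rater2: lists of labels assigned to the same cases."""
        n = len(rater1)
        observed = sum(a == b for a, b in zip(rater1, rater2)) / n
        labels = set(rater1) | set(rater2)
        # Chance agreement: probability both raters pick the same label at random,
        # given each rater's own label frequencies.
        expected = sum((rater1.count(l) / n) * (rater2.count(l) / n) for l in labels)
        return (observed - expected) / (1 - expected)

    # e.g. two experts labelling the same 10 eyes (1 = keratoconus, 0 = normal):
    print(cohens_kappa([1,0,0,1,1,0,0,0,1,0], [1,0,0,1,1,0,0,1,1,0]))  # 0.8
    ```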

    Sridhar Vedantham: Wow, that's impressive. Uh, I had another question. You know, whenever I went into an eye hospital, there always seemed to be somebody who is well trained as a technician operating various machines. How does it work with this? I mean do you need any specific training on SmartKC? How intensive is that and are there any particular type of people you look for or some kind of skill level in people who can operate a SmartKC unit?

    Dr. Kaushik Murali: So, most of the data collected so far has been captured by our optometry interns at the Sankara College of Optometry, so the skill set required was not very high. Like Mohit mentioned earlier, there is some assistance in terms of how you capture an image. The eventual endpoint for us would be this being used at a peripheral vision center or at a Primary Health Center as part of a school screening program, where a lay person can pick up this device, capture an image, and at least be able to counsel the person about needing a further examination. So, the training that would probably be required is largely about the condition they are screening for and what the next action needs to be. It's more a counseling skill, I would say, rather than anything really technical. The machine does it all. It gives them a heat map, it tells them what the value is, and it literally puts up a red flag saying, “Boss, be careful, there is a problem here”.

    Sridhar Vedantham: So, I'm guessing that's all part of the technology that's been built into it, say you know various guardrails to ensure that whoever is operating it is doing it in the right way, Mohit?

    Mohit Jain: Yes, yes, Sridhar. So, basically, in the latest version of the app, the operator, whoever is capturing the image, doesn't even have to click the capture button. You just have to place SmartKC on a patient's eye, and it automatically tries to align itself appropriately with the eye. Once the alignment is right, once the image does not have any blur, and the focus and lighting are appropriate, it automatically captures three images. So, that makes the process really easy for the operator, and he or she needs very minimal training to get up and running with the device.

    Sridhar Vedantham: Nice. And you know we have referred earlier to the cost of current machines that are there that are used in eye hospitals and so on, right? And one of the guiding principles behind developing SmartKC was for something to be portable, inexpensive and easy to use in the field. One thing that I don't think we've spoken about yet is actually how inexpensive SmartKC is, and I'm also curious to find out whether the equipment that you use needs to be high end. For example, do you need a high-end mobile phone to do it and how does the whole system work?

    Mohit Jain: Yes. For our research prototype, we ended up spending almost $33 making the device, apart from the smartphone. The smartphone we are using is a slightly high-end one, around a $400 device, for the current prototype. But moving ahead, for the next round of data collection with Sankara Hospital, we are trying out three different devices- a $100 device, a $200 device and a $300 device- so that we can see whether it works on all of them. However, based on our current work, we hypothesize that it can work on any standard smartphone; it just needs a decent camera. By the way, nowadays even $100 devices have a 20-megapixel camera, so I think that's already taken care of by most of the latest smartphones. Ideally it should work on all of them, but we will only know for sure once we have the second round of testing.

    Sridhar Vedantham: Cool. Uh, so you know, given that you've put in so much work and you've come up with something that sounds like a fantastic solution, what kind of impact do you think a SmartKC can have or what kind of impact do you hope SmartKC will have?

    Mohit Jain: So, we have discussed a few use cases with Dr. Kaushik, and I think the most interesting is the teacher use case that Dr. Kaushik referred to initially. The biggest impact SmartKC could have is if all the teachers in India- rural, urban, semi-urban, or even in low-income communities in urban India- had access to SmartKC, and every year, maybe twice or thrice a year, they screened every child in their school for keratoconus. Because the numbers are really high- two to three children out of every hundred will have keratoconus- we should be able to catch a few cases from every school. If these are diagnosed very early on, there is a very high likelihood that they can be treated just with simple glasses or contact lenses, and the children don't have to go through surgery or a corneal transplant or blindness, which is the worst-case situation. By the way, earlier we were talking about corneal transplants: globally, 27% of corneal transplants happen in order to fix keratoconus. So it is a very serious disease.

    Dr. Kaushik Murali: So, I'm going to take off on a tangent and look beyond just this particular solution. Although I am here representing Sankara at this podcast, a couple of my colleagues actually did a deep dive into this entire solution- Dr. Anand and Dr. Pallavi were there, as well as some of our faculty from the College of Optometry. What this project has really done is make us think about how we can leverage technology. Sitting in an eye hospital, often we have ideas, but we have no clue whom to ask. But honestly, now we know that there is a team at MSR that we can reach out to, saying, hey, here is a problem, we think this warrants attention- do you think you guys can solve it? And we found that works really well. So, this ability to collaborate between complete extremes of people- at one end you have medical specialists who have no clue about what engineering is. Today I think Mohit knows more optometry than even I do.

    (Laughs)

    So they actually borrowed textbooks from us to read. So, this kind of a collaboration is, I think, a phenomenal impact that this project has brought together, and we hope that together we will be able to come up with few more solutions that can align with our founders’ dream of eliminating needless blindness from India.

    Sridhar Vedantham: Nice, congratulations Mohit.

    Mohit Jain: Thank you. (Laughs).

    Sridhar Vedantham: I'll come to you for my next eye test.

    Dr. Kaushik Murali: You will soon start referring to him as Doctor Mohit.

    (Laughs)

    Sridhar Vedantham: All right. I think I've got all the information that I really wanted. But one thing is, you know, if people want to adopt SmartKC and they want to take it out there in the field, what do they need to do? And are there any particular requirements or criteria that they need to fulfil, etc? I mean, how do people get hold of this?

    Mohit Jain: So right now, we have made everything open source. Even the STL file for 3D printing the Placido disc is open source, so anyone can download it, go to a makerspace and get it 3D printed. The AI-assisted app running on the smartphone- we have only written it for Android- can be downloaded and installed on your Android phone; you connect the SmartKC Placido attachment and click an image of the eye. The image processing pipeline's code is also completely open source, so you can then run the captured image through it to generate the corneal topography heat map. That is the current state. Going ahead, we are putting everything on the cloud, so that once you have the Placido disc and you capture an image, it automatically goes to the cloud and gives you the final output.

    Sridhar Vedantham: Excellent. So I will add links to various things onto the podcast page once we're done with this. Dr. Kaushik, any final thoughts before we close this podcast?

    Dr. Kaushik Murali: So, quite often we look at challenges just as challenges, but I think this collaboration looked at an opportunity and at how we could actually work on it. All it required was to put some method to the madness, and what came up as one discussion has today, I think, turned into three different projects that are currently underway. So, there is real potential here. It would be lovely if similar technology companies could collaborate with medical institutions across India. We don't have to look elsewhere for solutions. It's all here. It's up to us to figure it out and run with it.

    Sridhar Vedantham: That's a great note to end the podcast on. So, once again thank you so much, Dr. Kaushik and Dr. Mohit. Thank you so much.

    (Laughs)

    Mohit Jain: Thank you, Sridhar.

    [Music ends]

  • Accelerating AI Innovation by Optimizing Infrastructure. With Dr. Muthian Sivathanu · 29 Sep 2021 · Microsoft Research India Podcast

    Episode 010 | September 28, 2021

    Artificial intelligence, Machine Learning, Deep Learning, and Deep Neural Networks are today critical to the success of many industries. But they are also extremely compute intensive and expensive to run in terms of both time and cost, and resource constraints can even slow down the pace of innovation. Join us as we speak to Muthian Sivathanu, Partner Research Manager at Microsoft Research India, about the work he and his colleagues are doing to enable optimal utilization of existing infrastructure to significantly reduce the cost of AI.

    Muthian's interests lie broadly in the space of large-scale distributed systems, storage, and systems for deep learning, blockchains, and information retrieval.

    Prior to joining Microsoft Research, he worked at Google for about 10 years, with a large part of the work focused on building key infrastructure powering Google web search, in particular the query engine for web search. Muthian obtained his Ph.D. from the University of Wisconsin-Madison in 2005 in the area of file and storage systems, and a B.E. from CEG, Anna University, in 2000.

    For more information about Microsoft Research India, click here.


    Transcript

    Muthian Sivathanu: Continued innovation in systems, efficiency and cost is going to be crucial to drive the next generation of AI advances, right. The last 10 years have been huge for deep learning and AI, and the primary reason for that has been the significant advance both in hardware, in terms of the emergence of GPUs and so on, and in software infrastructure to actually parallelize jobs, run large distributed jobs efficiently and so on. And if you think about the theory of deep learning, people knew about backpropagation, about neural networks, 25 years ago. And we largely use very similar techniques today. But why have they really taken off in the last 10 years? The main catalyst has been the advancement in systems. And if you look at the trajectory of current deep learning models, the rate at which they are growing larger and larger, systems innovation will continue to be the bottleneck in determining the next generation of advancement in AI.

    [Music]

    Sridhar Vedantham: Welcome to the Microsoft Research India podcast, where we explore cutting-edge research that’s impacting technology and society. I’m your host, Sridhar Vedantham.

    [Music]

    Sridhar Vedantham: Artificial intelligence, Machine Learning, Deep Learning, and Deep Neural Networks are today critical to the success of many industries. But they are also extremely compute intensive and expensive to run in terms of both time and cost, and resource constraints can even slow down the pace of innovation. Join us as we speak to Muthian Sivathanu, Partner Research Manager at Microsoft Research India, about the work he and his colleagues are doing to enable optimal utilization of existing infrastructure to significantly reduce the cost of AI.

    [Music]

    Sridhar Vedantham: So Muthian, welcome to the podcast and thanks for making the time for this.

    Muthian Sivathanu: Thanks Sridhar, pleasure to be here.

    Sridhar Vedantham: And what I'm really looking forward to, given that we seem to be in some kind of final stages of the pandemic, is to actually be able to meet you face to face again after a long time. Unfortunately, we've had to again do a remote podcast, which isn't all that much fun.

    Muthian Sivathanu: Right, right. Yeah, I'm looking forward to the time when we can actually do this again in office.

    Sridhar Vedantham: Yeah. Ok, so let me jump right into this. You know, we keep hearing about things like AI and deep learning and deep neural networks and so on and so forth. What's very interesting in all of this is that we tend to hear about the end product of all this, which is what actually impacts businesses, what impacts consumers, what impacts the health care industry, for example, in terms of AI. It's a little bit of a mystery, I think, to a lot of people as to how all this works, because... what goes on behind the scenes to actually make AI work is generally not talked about.

    Muthian Sivathanu: Yeah.

    Sridhar Vedantham: So, before we get into the meat of the podcast, do you want to speak a little bit about what goes on in the background?

    Muthian Sivathanu: Sure. So, machine learning, Sridhar, as you know, and deep learning in particular, is essentially about learning patterns from data. A deep learning system is fed a lot of training examples, examples of input and output, and then it automatically learns a model that fits that data, right. And this is typically called the training phase. So, the training phase is where it takes data and builds a model that fits it. Now what is interesting is, once this model is built, which was really meant to fit the training data, the model is really good at answering queries on data that it had never seen before, and this is where it becomes useful. These models are built in various domains. It could be for recognizing an image, for converting speech to text, and so on, right. And what has in particular happened over the last 10 or so years is that there has been significant advancement both on the theory side of machine learning, which is new algorithms, new model structures that do a better job of fitting the input data to a generalizable model, as well as rapid innovation in systems infrastructure which actually enables the model to do its work, which is very compute intensive, in a way that's actually scalable, economically feasible, cost effective and so on.
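
    As an aside for readers, here is a toy sketch of the two phases Muthian describes: training fits parameters to (input, output) examples, and inference then answers queries on unseen data. The linear model and all numbers are purely illustrative.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.uniform(-1, 1, size=(1000, 3))    # training inputs
    y = X @ np.array([2.0, -1.0, 0.5]) + 0.1  # training outputs to fit

    # Training phase: repeatedly adjust parameters to fit the examples.
    w, b = np.zeros(3), 0.0
    for _ in range(500):
        err = X @ w + b - y
        w -= 0.1 * (X.T @ err) / len(X)       # gradient step on the weights
        b -= 0.1 * err.mean()                 # gradient step on the bias

    # Inference phase: the learned model answers a query on unseen data.
    x_new = np.array([0.3, -0.2, 0.9])
    print(x_new @ w + b)                      # ~1.35 for this example
    ```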

    Sridhar Vedantham: OK, Muthian, so it sounds like there's a lot of compute actually required to make things like AI and ML happen. Can you give me a sense of what kind of resources or how intensive the resource requirement is?

    Muthian Sivathanu: Yeah. So the resource usage of a machine learning model is a direct function of how many parameters it has; the more complex the data set, the larger the model gets, and correspondingly it requires more compute resources, right. To give you an idea, the early machine learning models, which performed simple tasks like recognizing digits and so on, could run on a single server machine in a few hours. But just over the last two years, for example, the size of the largest useful model, the one that achieves state-of-the-art accuracy, has grown by nearly three orders of magnitude, right. And what that means is that today, to train these models, you need thousands and thousands of servers, and that's infeasible. Also, accelerators or GPUs have really taken over in the last 6-7 years. A single V100 GPU today, a Volta GPU from NVIDIA, can run about 140 trillion operations per second. And you need several hundreds of them to actually train a model like this, and they run for months together. To train a 175 billion parameter model, like the recent GPT-3, you need on the order of thousands of such GPUs, and it still takes a month.

    Sridhar Vedantham: A month, that's sounds like a humongous amount of time.

    Muthian Sivathanu: Exactly, right? So that's why I think, just as the advances in the theory of machine learning, in terms of new algorithms, new model structures and so on, have been crucial to the recent advance in the relevance and practical utility of deep learning, equally important has been this advancement in systems, right. Because given the huge explosion of compute demands that these workloads place, we need fundamental innovation in systems to actually keep pace, to make sure that you can train these models in reasonable time, and that you can do it at reasonable cost.

    Sridhar Vedantham: Right. Ok, so you know for a long time, I was generally under the impression that if you wanted to run bigger and bigger models and bigger jobs, essentially you had to throw more hardware at it because at one point hardware was cheap. But I guess that kind of applies only to the CPU kind of scenario, whereas the GPU scenario tends to become really expensive, right?

    Muthian Sivathanu: Yep, yeah.

    Sridhar Vedantham: Ok, so in which case, when there is basically some kind of a limit being imposed because of the cost of GPUs, how does one actually go about tackling this problem of scale?

    Muthian Sivathanu: Yeah, so the high-level problem ends up being, you have limited resources, so let's say you can view this in two perspectives, right. One is from the perspective of a machine learning developer or a machine learning researcher, who wants to build a model to accomplish a particular task right. So, from the perspective of the user, there are two things you need. A, you want to iterate really fast, right, because deep learning, incidentally, is this special category of machine learning, where the exploration is largely by trial and error. So, if you want to know which model actually works which parameters, or which hyperparameter set actually gives you the best accuracy, the only way to really know for sure is to train the model to completion, measure accuracy, and then you would know which model is better, right. So, as you can see, the iteration time, the time to train a model to run inference on it directly impacts the rate of progress you can achieve. The second aspect that the machine learning researcher cares about is cost. You want to do it without spending a lot of dollar cost.

    Sridhar Vedantham: Right.

    Muthian Sivathanu: Now, from the perspective of, let's say, a cloud provider who runs this huge farm of GPUs and offers it as a service for researchers, for users, to run machine learning models, their objective function is cost, right. To support a given workload, you need to support it with as few GPUs as possible. Or in other words, if you have a certain amount of GPU capacity, you want to maximize the utilization, the throughput, you can get out of those GPUs, and that's where a lot of the work we've been doing at MSR has focused: how do you multiplex lots and lots of jobs onto a finite set of GPUs, while maximizing the throughput that you can get from them?

    Sridhar Vedantham: Right, so I know you and your team have been working on this problem for a while now. Do you want to share with us some of the key insights and some of the results that you've achieved so far? Because it is interesting, right? Schedulers have been around for a while- it's not that there aren't schedulers- but essentially what you're saying is that the schedulers that exist don't really cut it, given the intensity of the compute requirements as well as the size of the jobs and models that are being run today in terms of deep learning or even machine learning, right?

    Muthian Sivathanu: That's right.

    Sridhar Vedantham: So, what are your, key insights and what are some of the results that you guys have achieved?

    Muthian Sivathanu: So, you raise a good point. I mean, schedulers for distributed systems have been around for decades, right. But traditional schedulers have to view a job as a black box, because they're meant to run arbitrary jobs, and there is a limit to how efficient they can be. What makes deep learning somewhat special is, first of all, that it is such a high-impact area- from an economic perspective, there are billions of dollars spent on these GPUs and so on- so there is enough economic incentive to extract the last bit of performance out of these expensive GPUs, right. And that lends itself into this realm of: what if we co-design? What if we custom design a scheduler for the specific case of deep learning, right? And that's what we did in the Gandiva project, which we published at OSDI in 2018. What we said was, instead of viewing a deep learning job as just another distributed job which is opaque to us, let's actually exploit some key characteristics that are unique to deep learning jobs, right? And one of those characteristics is that although, as I said, a single deep learning training job can run for days or even months, deep within, it is actually composed of millions and millions of what are called mini batches. So, what is a mini batch? A mini batch is an iteration in the training where it reads one set of input training examples, runs it through the model, and then back-propagates the loss, essentially changing the parameters to fit that input. And this sequence, this mini batch, repeats over and over again across millions and millions of mini batches. And what makes it particularly interesting and relevant from a systems optimization viewpoint is that, from a resource usage perspective and from a performance perspective, mini batches are identical. They may be operating on different data in each mini batch, but the computation they do is pretty much identical. And what that means is we can look at the job for a few mini batches and we can know exactly what it is going to do for the rest of its lifetime, right. And that allows us to, for example, automatically decide which hardware generation is the best fit for this job, because you can just measure it in a whole bunch of hardware configurations. Or when you're distributing the job, you can compare it across a whole bunch of parallelism configurations, and you can automatically figure out the right configuration, the right hardware assignment, for this particular job. You couldn't do that for an arbitrary job with a distributed scheduler, because the job could be doing different things at different times. A MapReduce job, for example, would keep fluctuating in how it uses CPU, network, storage, and so on, whereas with deep learning there is this remarkable repeatability and predictability, right. What it also allows us to do is look within a mini batch at what happens, and it turns out that if you look at the memory usage- how much GPU memory the training loop itself is consuming- somewhere in the middle of a mini batch, the memory peaks to almost fill the entire GPU memory, right. And then by the time the mini batch ends, the memory usage drops down by a factor of anywhere between 10 to 50x.
    Right, and so there is this sawtooth pattern in the memory usage. So one of the things we did in Gandiva was to propose a mechanism for transparently migrating a job: you should be able to checkpoint a job on demand. The scheduler should be able to do it and just move the job to a different machine, maybe even a different GPU on a different machine, and so on, right. And this is very powerful for load balancing; lots of scheduling problems become easy if you can do this. Now, when you are actually moving a job from one machine to another, it helps if the amount of state you need to move is small, right. And that's where this awareness of mini batch boundaries helps us, because now you can choose exactly when to move the job so that you move a 50x smaller amount of state.
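
    A skeleton of the idea Muthian describes might look as follows. This is only a sketch: the Job class, checkpoint() and the migration hook are hypothetical stand-ins, not Gandiva's real API.

    ```python
    import time

    class Job:
        def __init__(self, batches):
            self.batches = batches
            self.minibatch_times = []   # a few samples predict the whole job,
                                        # since mini batches are near-identical
        def run_minibatch(self, batch):
            time.sleep(0.01)            # stand-in for forward + backward + update

        def checkpoint(self):
            # At a mini-batch boundary the live state is 10-50x smaller than at
            # the mid-batch memory peak, so this snapshot is cheap to move.
            return {"params": "...", "optimizer": "..."}

    def train(job, migration_target=None):
        for step, batch in enumerate(job.batches):
            t0 = time.time()
            job.run_minibatch(batch)
            job.minibatch_times.append(time.time() - t0)

            # Mini-batch boundary: the cheap moment to migrate, if the
            # scheduler has picked a better machine/GPU for this job.
            if migration_target is not None and step == 2:   # arbitrary trigger
                state = job.checkpoint()
                print(f"migrating to {migration_target} with state {state}")
                return

    train(Job(batches=range(10)), migration_target="machine-B/gpu-0")
    ```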

    Sridhar Vedantham: Right. Very interesting, and another part of this whole thing about resources and compute and all that is, I think, the demands on storage itself, right?

    Muthian Sivathanu: Yeah.

    Sridhar Vedantham: Because if the models are that big, that you need some really high-powered GPUs to compute, how do you manage the storage requirements?

Muthian Sivathanu: Right, right. So, it turns out the biggest requirement that deep learning poses on storage is the throughput that you need from storage, right. So, as I mentioned, because GPUs are the most expensive resource in this whole infrastructure stack, the single most important objective is to keep GPUs busy all the time, right. You don't want them idling, at all. What that means is the input training data that the model needs in order to run its mini batches has to be fed to it at a rate that is sufficient to keep the GPUs busy. And the amount of data that a GPU can process from a compute perspective has been growing at a very rapid pace, right. So what that means is, you know, between the Volta series and the Ampere series of GPUs, for example, there is like a 3x improvement in compute speed, right. Now that means the storage bandwidth should keep up with that pace, otherwise a faster GPU doesn't help- it will be stalling on IO. So, in that context, one of the systems we built was a system called Quiver. The standard model for running this training is- the datasets are large, I mean the datasets can be in terabytes- you place them on some remote cloud storage system, like Azure Blob or something like that, and you read them remotely from whichever machine does the training, right. And that bandwidth simply doesn't cut it, because it goes through network backbone switches and so on, and it becomes insanely expensive to sustain that level of bandwidth from a traditional cloud storage system, right. So what we need to achieve here is hyper-locality. Ideally the data should reside on the exact machine that runs the training- then it's a local read- and it has to reside on SSD and so on, right. So, you need several gigabytes per second of read bandwidth.

    Sridhar Vedantham: And this is to reduce network latency?

Muthian Sivathanu: Yes, this is to reduce network latency and congestion- like when traffic goes through lots of back-end switches, T1 switches, T2 switches, etc., the end-to-end throughput that you get across the network is not as much as what you can get locally, right?

    Sridhar Vedantham: Right.

Muthian Sivathanu: So, ideally you want to keep the data local on the same machine, but as I said, for some of these models the dataset can be in tens of terabytes. So what we really need is a distributed cache, so to speak- but a cache that is locality aware. So what we have is a mechanism by which, within each locality domain- a rack, for example- we have a copy of the entire training data. A rack could comprise maybe 20 or 30 machines, so across them you can still fit the training data, and then you do peer-to-peer across machines in the rack for access to the cache. And within a rack, network bandwidth is not a limitation- you can get nearly the same performance as you could from local SSD. So that's what we did in Quiver, and there are a bunch of challenges here, because if every model wants the entire training data to be local- to be within the rack- then there is just no cache space for keeping all of that.

    Sridhar Vedantham: Right.

    Muthian Sivathanu: Right. So we have this mechanism by which we can transparently share the cache across multiple jobs, or even multiple users without compromising security, right. And we do that by sort of intelligent content addressing of the cache entries so that even though two users may be accessing different copies of the same data internally in the cache, they will refer to the same instance.

Sridhar Vedantham: Right, I was actually just going to ask you that question- how do you maintain security of data, given that you're talking about distributed caching? Because it's very possible that multiple users' jobs will be running simultaneously. But that's good, you answered it yourself. So, you know, I've heard you speak a lot about things like micro co-design and so on. How do you bring those principles to bear in these kinds of projects?

Muthian Sivathanu: Right, right. So, I alluded to this a little bit in one of my earlier points, which is the interface. I mean, if you look at a traditional scheduler, it views the job as a black box, right. That is an example of the traditional philosophy of system design, where you build each layer independent of the layer above or below it, right. There are good reasons to do it, because, you know, multiple use cases can use the same underlying infrastructure- if you look at an operating system, it's built to run any process, whether it is Office or a browser or whatever, right.

    Sridhar Vedantham: Right.

Muthian Sivathanu: But, in workloads like deep learning, which place particularly high demands on compute that is super expensive and so on, there is benefit to relaxing this tight layering to some extent, right. So that's the philosophy we take in Gandiva, for example, where we say the scheduler no longer needs to treat the job as a black box- it can make use of internal knowledge. It can know what mini batch boundaries are. It can know that mini batch times are repeatable, and stuff like that, right. So, co-design is a philosophy that has been gaining traction over the last several years, and people typically refer to hardware-software co-design, for example. What we do in micro co-design is take a more pragmatic view of co-design, where we say, look, it's not always possible to rebuild entire software layers from scratch to make them more tightly coupled. The reality is that in existing large systems we have these software stacks, infrastructure stacks, and what can we do without rocking the ship, without throwing everything away and building from a clean slate? So, what we make are very surgical, carefully thought through interface changes that allow us to expose more information from one layer to another, and we also introduce some control points that allow one layer to control another. For example, the scheduler can have a control point to ask a job to suspend. And it turns out that by opening up those carefully thought through interface points, you leave the bulk of the infrastructure unchanged, yet achieve the efficiencies that result from richer information and richer control, right. So, micro co-design is something we have been adopting not only in Gandiva and Quiver, but in several other projects in MSR. And MICRO stands for Minimally Invasive, Cheap and Retrofittable co-design. So, it's a more pragmatic view of co-design in the context of large cloud infrastructures.
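As an illustration of such a control point, here is a hedged Python sketch, with hypothetical names, of a job that honors a scheduler's suspend request at the next mini-batch boundary, where its transferable state is smallest. This shows the shape of the idea, not the actual Gandiva interface.

```python
import threading

class TrainingJob:
    """A job exposing one narrow control point: the scheduler can request a
    suspend, and the job honors it at the next mini-batch boundary, when its
    GPU memory footprint (and hence the state to move) is smallest."""

    def __init__(self, num_minibatches):
        self.num_minibatches = num_minibatches
        self._suspend_requested = threading.Event()

    def request_suspend(self):
        # Called by the scheduler; takes effect at the next boundary.
        self._suspend_requested.set()

    def run(self, train_one_minibatch, checkpoint, start_at=0):
        for i in range(start_at, self.num_minibatches):
            train_one_minibatch(i)
            if self._suspend_requested.is_set():
                checkpoint(i + 1)        # persist the small boundary state
                return i + 1             # scheduler can resume from here later
        return self.num_minibatches      # job ran to completion
```

The rest of the training stack is untouched; only this one suspend/checkpoint seam is opened up, which is the "minimally invasive" part of the philosophy.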

    Sridhar Vedantham: Right, where you can do the co-design with the minimum disruption to the existing systems.

    Muthian Sivathanu: That's right.

    Sridhar Vedantham: Excellent.

    [Music]

Sridhar Vedantham: We have spoken a lot about the work that you've been doing, and it's quite impressive. Do you have some numbers- in terms of, you know, how much faster jobs run, or savings of any nature- that you can share with us?

Muthian Sivathanu: Yeah, sure. So the numbers, as always, depend on the workload and several aspects, but I can give you some examples. So, in the Gandiva work that we did, we introduced this ability to time slice jobs, right. The idea is, today when you launch a job on a GPU machine, that job essentially holds on to that machine until it completes, and until that time it has exclusive possession of that GPU- no other job can use it, right. And this is not ideal in several scenarios. You know, one classic example is hyperparameter tuning, where you have a model and you need to decide what exact hyperparameter values- like learning rate, etc.- are actually the best fit and give the best accuracy for this model. So, people typically do what is called hyperparameter search, where you run maybe 100 instances of the model, see how they're doing, maybe kill some instances, spawn new instances, and so on, right. And hyperparameter exploration really benefits from parallelism. You want to run all these instances at the same time so that you have an apples-to-apples comparison of how they are doing. And if you want to run like 100 configurations and you have only 10 GPUs, that significantly slows down hyperparameter exploration- it serializes it, right. What Gandiva has is an ability to perform fine grained time slicing of the same GPU across multiple jobs. Just like how an operating system time slices multiple processes, multiple programs on the same CPU, we do the same in the GPU context, right. And because we make use of mini batch boundaries and so on, we can do this very efficiently. And with that we showed that for typical hyperparameter tuning, we can speed up the end-to-end time to accuracy by nearly 5-6x, right. And so this is one example of how time slicing can help. We also saw that from a cluster wide utilization perspective, some of the techniques that Gandiva adopted can improve overall cluster utilization by 20-30%. And this directly translates to cost incurred by the cloud provider running those GPUs, because it means with the same GPU capacity I can serve 30% more workload, or vice versa- for a given workload I only need 30% fewer GPUs.
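The time-slicing idea can be sketched with plain Python generators, where each hyperparameter trial yields control after every mini batch; this is an illustrative toy with assumed names, not Gandiva's GPU-level mechanism.

```python
def time_slice(trials, slice_minibatches=50):
    """Round-robin one GPU across many hyperparameter trials, switching at
    mini-batch boundaries so every trial makes comparable progress."""
    active = list(trials)
    while active:
        for trial in list(active):
            try:
                for _ in range(slice_minibatches):
                    next(trial)              # run one mini batch of this trial
            except StopIteration:
                active.remove(trial)         # this trial finished (or was killed)

def make_trial(learning_rate, total_minibatches=1000):
    """Each trial is a generator that yields after every mini batch."""
    def steps():
        for step in range(total_minibatches):
            # ... train one mini batch with this learning_rate ...
            yield
    return steps()

# Explore three learning rates concurrently on one (conceptual) GPU.
time_slice([make_trial(lr) for lr in (0.1, 0.01, 0.001)])
```

Switching at mini-batch boundaries is what keeps the context switch cheap, since that is when the job's GPU memory footprint is at its minimum.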

Sridhar Vedantham: Yeah, I mean, those savings sound huge, and I think you're also therefore talking about reducing the cost of AI, making the process of AI itself more efficient.

Muthian Sivathanu: That's correct, that's correct. So, the more we are able to extract performance out of the same infrastructure, the more the cost per model or the cost per user goes down, and so the cost of AI reduces. And for large companies like Microsoft or Google, which have first party products that require deep learning- like search and Office and so on- it reduces the capital expenditure of running such clusters to support those workloads.

Sridhar Vedantham: Right.

Muthian Sivathanu: And we've also been thinking about areas such as- today there is this limitation that large models need to run in really tightly coupled hyperclusters which are connected via InfiniBand and so on. And that brings another dimension of cost escalation to the equation, because these are scarce, the networking itself is expensive, there is fragmentation across hyperclusters, and so on. What we showed in some recent work is how you can actually run training of large models in just commodity VMs- these are just commodity GPU VMs- without any requirement that they be part of the same InfiniBand cluster or hypercluster; they can be scattered anywhere in the data center. And more interestingly, we can actually run these off of spot VMs. So Azure, AWS- all cloud providers provide these bursty VMs or low priority VMs, which is essentially a way for them to sell spare capacity, right. So you get them at a significant discount, maybe a 5-10x cheaper price. And the downside of that is they can go away at any time- they can be preempted when real demand shows up. So, what we showed is that it's possible to train such massive models at the same performance, despite these being spot VMs spread over a commodity network without custom InfiniBand and so on. So that's another example of how you can bring down the cost of AI by reducing constraints on what hardware you need.
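A minimal sketch of the preemption-tolerance idea, with a hypothetical checkpoint path; a real system would checkpoint model and optimizer state and overlap training with checkpointing, not just persist a step counter.

```python
import os
import pickle

CHECKPOINT_PATH = "checkpoint.pkl"   # hypothetical path on durable storage

def train_on_spot_vms(train_one_minibatch, total_minibatches):
    """Resumable loop: progress is checkpointed at mini-batch boundaries, so
    when a spot VM is preempted, its replacement continues from the last
    checkpoint instead of starting the job over."""
    step = 0
    if os.path.exists(CHECKPOINT_PATH):
        with open(CHECKPOINT_PATH, "rb") as f:
            step = pickle.load(f)        # resume where the evicted VM stopped
    while step < total_minibatches:
        train_one_minibatch(step)
        step += 1
        if step % 100 == 0:              # cheap checkpoint at a boundary
            with open(CHECKPOINT_PATH, "wb") as f:
                pickle.dump(step, f)
```

Because preemption costs at most the work since the last checkpoint, the discounted spot price can outweigh the occasional lost progress.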

    Sridhar Vedantham: Muthian, we're kind of reaching the end of the podcast, and is there anything that you want to leave the listeners with, based on your insights and learning from the work that you've been doing?

Muthian Sivathanu: Yeah, so taking a step back, right- I think continued innovation in systems, efficiency and cost is going to be crucial to drive the next generation of AI advances. The last 10 years have been huge for deep learning and AI, and the primary reason for that has been the significant advance both in hardware, in terms of the emergence of GPUs and so on, and in software infrastructure to parallelize jobs, run large distributed jobs efficiently, and so on. And if you think about the theory of deep learning, people knew about backpropagation and neural networks 25 years ago, and we largely use very similar techniques today. But why have they really taken off in the last 10 years? The main catalyst has been the advancement in systems. And if you look at the trajectory of current deep learning models, the rate at which they are growing larger and larger, systems innovation will continue to be the bottleneck determining the next generation of advancement in AI.

    Sridhar Vedantham: Ok Muthian, I know that we're kind of running out of time now but thank you so much. This has been a fascinating conversation.

    Muthian Sivathanu: Thanks Sridhar, it was a pleasure.

    Sridhar Vedantham: Thank you

  • Microsoft Research India Podcast – Podcast (9)
    Dependable IoT: Making data from IoT devices dependable and trustworthy for good decision making. With Dr. Akshay Nambi and Ajay Manchepalli14 jun 2021· Microsoft Research India Podcast

    Episode 009 | June 15, 2021

    The Internet of Things has been around for a few years now and many businesses and organizations depend on data from these systems to make critical decisions. At the same time, it is also well recognized that this data- even up to 40% of it- can be spurious, and this obviously can have a tremendously negative impact on an organizations’ decision making. But is there a way to evaluate if the sensors in a network are actually working properly and that the data generated by them are above a defined quality threshold? Join us as we speak to Dr Akshay Nambi and Ajay Manchepalli, both from Microsoft Research India, about their innovative work on making sure that IoT data is dependable and verified, truly enabling organizations to make the right decisions.

    Akshay Nambi is a Senior Researcher at Microsoft Research India. His research interests lie at the intersection of Systems and Technology for Emerging Markets broadly in the areas of AI, IoT, and Edge Computing. He is particularly interested in building affordable, reliable, and scalable IoT devices to address various societal challenges. His recent projects are focused on improving data quality in low-cost IoT sensors and enhancing performance of DNNs on resource-constrained edge devices. Previously, he spent two years at Microsoft Research as a post-doctoral scholar and he has completed his PhD from the Delft University of Technology (TUDelft) in the Netherlands.

    Ajay Manchepalli, as a Research Program Manager, works with researchers across Microsoft Research India, bridging Research innovations to real-world scenarios. He received his Master’s degree in Computer Science from Temple University where he focused on Database Systems. After his Masters, Ajay spent his next 10 years shipping SQL Server products and managing their early adopter customer programs.

For more information about Microsoft Research India, click here.

    Related

    Microsoft Research India Podcast: More podcasts from MSR IndiaiTunes: Subscribe and listen to new podcasts on iTunesAndroidRSS FeedSpotifyGoogle PodcastsEmail

    Transcript

    Ajay Manchepalli: The interesting thing that we observed in all these scenarios is how the entire industry is trusting data, and using this data to make business decisions, and they don't have a reliable way to say whether the data is valid or not. That was mind boggling. You're calling data as the new oil, we are deploying these things, and we're collecting the data and making business decisions, and you're not even sure if that data that you've made your decision on is valid. To us it came as a surprise that there wasn't enough already done to solve these challenges and that in some sense was the inspiration to go figure out what it is that we can do to empower these people, because at the end of the day, your decision is only as good as the data.

    [Music]

    Sridhar Vedantham: Welcome to the Microsoft Research India podcast, where we explore cutting-edge research that’s impacting technology and society. I’m your host, Sridhar Vedantham.

    [Music]

    The Internet of Things has been around for a few years now and many businesses and organizations depend on data from these systems to make critical decisions. At the same time, it is also well recognized that this data- even up to 40% of it- can be spurious, and this obviously can have a tremendously negative impact on an organizations’ decision making. But is there a way to evaluate if the sensors in a network are actually working properly and that the data generated by them are above a defined quality threshold? Join us as we speak to Dr Akshay Nambi and Ajay Manchepalli, both from Microsoft Research India, about their innovative work on making sure that IoT data is dependable and verified, truly enabling organizations to make the right decisions.

    [Music]

    Sridhar Vedantham: So, Akshay and Ajay, welcome to the podcast. It's great to have you guys here.

    Akshay Nambi: Good evening Sridhar. Thank you for having me here.

    Ajay Manchepalli: Oh, I'm excited as well.

    Sridhar Vedantham: Cool, and I'm really keen to get this underway because this is a topic that's quite interesting to everybody, you know. When we talk about things like IoT in particular, this has been a term that's been around for quite a while, for many years now and we've heard a lot about the benefits that IoT can bring to us as a society or as a community, or as people at an individual level. Now you guys have been talking about something called Dependable IoT. So, what exactly is Dependable IoT and what does it bring to the IoT space?

Ajay Manchepalli: Yeah, IoT is one area we have seen growing exponentially. I mean, if you look at the number of devices that are being deployed, it's going into the billions, and most industries now rely on this data to make their business decisions. And when they go about doing this, we have seen from our own experience that there are a lot of challenges that come into play when you're dealing with IoT devices. These are deployed in far-off, remote locations and in harsh weather conditions, and all of these things can lead to reliability issues with these devices. In fact, the CTO of GE Digital mentioned that, you know, about 40% of all the data they see from these IoT devices is spurious, and even KPMG had a report saying that, you know, over 80% of CEOs are concerned about the quality of data that they're basing their decisions on.

And we observed that in our own deployments early on, and that's when we realized that there is a fundamental requirement to ensure that the data being collected is actually good data, because all these decisions are based on the data. And since data is the new oil, we are basically focusing on what we can do to help these businesses know whether the data they're consuming is valid or not- and that starts at the source of truth, which is the sensors and the sensor devices. And so Akshay has built this technology that enables you to understand whether the sensors are working fine or not.

    Sridhar Vedantham: So, 40% of data coming from sensors being spurious sounds a little frightening, especially when we are saying that you know businesses and other organizations base a whole lot of the decisions on the data they're getting, right?

    Ajay Manchepalli: Absolutely.

    Sridhar Vedantham: Akshay, was there anything you wanted to add to this?

Akshay Nambi: Yeah, so if you see, reliability and security are the two big barriers limiting the true potential of IoT, right? And over the past few years you would have seen the IoT community, including Microsoft, make significant progress in improving the security aspects of IoT. However, techniques to determine data quality and sensor health remain quite limited. Like security, sensor reliability and data quality are fundamental to realizing the true potential of IoT, and that is the focus of our project- Dependable IoT.

    Sridhar Vedantham: Ok, so you know, once again, we've heard these terms like IoT for many years now. Just to kind of demonstrate what the two of you have been speaking about in terms of various aspects or various scenarios in which IoT can be deployed, could you give me a couple of examples where IoT use is widespread?

Akshay Nambi: Right, so let me give an example of air pollution monitoring. Air pollution is a major concern worldwide, and governments are looking for ways to collect fine-grained data to identify and curb pollution. To do this, low-cost sensors are being used to monitor pollution levels. They have been deployed in numerous places and on moving vehicles to capture pollution levels accurately. The challenge with these sensors is that they are prone to failures, mainly due to the harsh environments in which they are deployed.

For example, imagine a pollution sensor measuring high pollution values at a particular location. Given that air pollution is such a local phenomenon, it's impossible to tell whether this sensor data is an anomaly or valid data without additional contextual information or sensor redundancy. And due to these reliability challenges, the validity and viability of these low-cost sensors have been questioned by various users.

    Sridhar Vedantham: Ok, so it sounds kind of strange to me that sensors are being deployed all over the place now and you know, frankly, we all carry sensors on ourselves, right, all the time. Our phones have multiple sensors built into them and so on. But when you talk about sensors breaking down or being faulty or not providing the right kind of data back to the users, what causes these kind of things? I mean, I know you said in the context of, say, air pollution type sensors, you know it could be harsh environments and so on, but what are other reasons for, because of which the sensors could fail or sensor data could be faulty?

Akshay Nambi: Great question- so sensors can go bad for numerous reasons, right? This could be due to sensor defect or damage- think of a soil moisture sensor deployed in an agricultural farm being run over by a tractor. Or it could be sensor drift due to wear and tear of sensing components, sensor calibration issues, human error, and also environmental factors like dust and humidity. And the challenge is that in all these cases the sensor does not stop sending data, but continues to send some data which is garbage or dirty, right? And the key challenge is that it is nontrivial to detect whether a remote sensor is working or faulty, for the following reasons. First, a faulty sensor can mimic non-faulty sensor data, which is very hard to distinguish. Second, to detect sensor faults you can use sensor redundancy, which becomes very expensive. Third, the cost and logistics of sending a technician to figure out the fault are high, and it is also very cumbersome. Finally, time series algorithms like anomaly detectors are not reliable, because an anomaly need not imply faulty data.

    Sridhar Vedantham: So, a quick question on one of the things that you've said. When you're talking about sensor redundancy, this just means that deploy multiple sensors, so if one fails then you use the other one or do you use data from the other one. Is that what that means?

    Akshay Nambi: Yeah, so sensor redundancy can be looked at both the ways. When one fails you could use the other, but it also be used to take the majority voting of multiple sensors in the same location. Going back to my air pollution example, if multiple sensors are giving very high values right, then you have high confidence in the data as opposed to thinking that is a faulty data. So that's how sensory redundancy is typically used today.

    Sridhar Vedantham: OK, and there you just have to take on faith that the data that you're getting from multiple sensors is actually valid.

    Akshay Nambi: Exactly, exactly. You never know that if all of them could have undergone the same fault.

    Sridhar Vedantham: Right.

Ajay Manchepalli: It's interesting, when we think about how the industry tries to figure out if sensors are working or not, there are three distinct approaches that we always observe, right? One is you check whether the sensor is working by using additional surrounding data. For example, let's say it's raining heavily, but your moisture sensor is indicating that the moisture level is low. That data doesn't align, right- the weather data indicates there's rain, but the moisture sensor is not giving you the right reading- so that's one way people can identify whether it's working or not. The other is what we just talked about, which is sensor redundancy- you increase the number of sensors in that area and poll among a bunch of sensors. That also makes sense. And the third one is the one you typically really can trust, and that is you deploy someone out there to physically go look at the sensor and have it tested. And if you start thinking about the scenarios we are talking about, which is remote, far-away locations- imagine deploying sensors across the country and having to send people out to validate them. There is a cost associated with sending people, as well as that sort of downtime, and so being able to remotely and reliably say that a sensor is at fault is an extremely empowering scenario. And as we look at this, it's not just sensor reliability, right? For example, if you think of a telephone, a landline, you have the dial tone which tells you if the phone is working or not, right? Similarly, we are trying to use certain characteristics of these sensors that tell us whether they are working or not. But the beauty of this solution is that it's not just limited to being a dial tone for sensors- it is more than that. It not only tells you whether the sensor is working, it can tell you if it is the sensor you intended to deploy.

I mean, think of it this way. A company could work with a vendor and procure a certain class of sensors, and they have an agreement for that. And when these sensors are deployed, the actual sensors that get deployed may or may not be in that class of devices, intentionally or unintentionally, right? How do you know that? If we understand the nature of the sensor, we can remotely identify the type of sensor that is deployed and help industries figure out whether the sensor that's deployed is the sensor they intended to deploy. So it's more than just whether the sensor is working- you can identify it, you can even figure out things like data drift. This is a pretty powerful scenario that we are going after.

    Sridhar Vedantham: Right, and that's a lovely teaser for me to ask my next question. What exactly are you guys talking about and how do you do this?

Akshay Nambi: Right, so our key value proposition is basically a simple and easy way to remotely measure and observe the health of a sensor. The core technology behind this value proposition is the ability to automatically generate a fingerprint. When I say a fingerprint, what I'm referring to is the unique electrical characteristic exhibited by these sensors, both analog and digital. Let me give an example. Think of analog sensors, which produce a continuous output signal proportional to the quantity being measured. Our key insight here is that a sensor's voltage response right after powering down exhibits a unique characteristic- what we refer to as the fall curve. This fall curve depends upon the sensor circuitry and the parasitic elements present in the sensor, thereby making it unique for each sensor type. So think of it this way- the fall curve acts as a reference signature for the sensor when it is working. And when the sensor goes bad, this fall curve changes drastically, and now, by just comparing this fingerprint, we can tell whether a sensor is working or faulty.

    Ajay Manchepalli: The interesting part about the fingerprint that Akshay just mentioned is that it is all related to the physical characteristics of the sensors, right? You have a bunch of capacitors, resistors, all of those things put together to build the actual sensor device. And each manufacturer or each sensor type or each scenario would have a different circuit and because of that, when you power down this, because of its physical characteristics, you see different signatures. So this is a unique way of being able to identify not just what type of sensor, but even based on the manufacturer, because the circuitry for that particular manufacturer will be different.

    Sridhar Vedantham: So, just to clarify, when you're saying that sensors have unique fingerprints, are you talking about particular model of a sensor or a particular class of a sensor or a particular type of sensor?

Akshay Nambi: Right, great question again. So, these fingerprints are unique for a particular type of sensor. For example, take a soil moisture sensor from Seeed Studio- for that particular sensor type from that manufacturer, the signature remains the same. So all you have to do is collect the fingerprint once for that manufacturer and sensor type, and then you can use it to compare against the operational fingerprints. Similarly, in the case of digital sensors we use the current drawn as a reference fingerprint to detect whether the sensor is working or not, and the key hypothesis behind these fingerprints is that when a sensor accumulates damage, its physical properties also change, leading to a distinct current profile compared to that of a working sensor. That's the key property behind developing these fingerprints. And one of the key aspects of these fingerprints is that they are unaffected by external factors like environmental changes- temperature, humidity, and so on. So these fingerprints are unique for each sensor type and also independent of environmental changes. In that way, once you collect a fingerprint, it should hold good irrespective of the scenario where you are deploying the sensor.
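To illustrate the comparison step described above, here is a hedged Python sketch; the distance metric and threshold are illustrative assumptions, and the actual matching method is described in the team's papers. A distance that grows slowly across successive checks, while staying under the fault threshold, would correspond to the drift signal discussed next.

```python
import numpy as np

def check_sensor(measured_curve, reference_curve, fault_threshold=0.05):
    """Compare a sensor's power-down voltage response (its fall curve) with
    the reference fingerprint collected once for this sensor type. A small
    distance means healthy; a large one flags a fault; a slowly growing
    distance over successive checks indicates drift."""
    measured = np.asarray(measured_curve, dtype=float)
    reference = np.asarray(reference_curve, dtype=float)
    # Normalized RMS distance between the two curves (an illustrative metric).
    distance = np.sqrt(np.mean((measured - reference) ** 2)) / np.ptp(reference)
    return ("working" if distance < fault_threshold else "faulty"), distance
```

Note that this check never touches the sensor's measurement data itself, which is exactly the "orthogonal to data" property mentioned later in the conversation.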

Ajay Manchepalli: One other thing that I want to call out there is that the beauty of these electrical signatures is that they are based on the physical characteristics of the sensors, right? So it's not only when the device fails that the physical characteristics change, and hence the signature changes. Over time, as things degrade, the physical characteristics of the sensor or the device are also degrading, and when that happens, the electrical signature shows that kind of degradation too. And that is very powerful, because now you can actually identify or track the kind of data drift that people are experiencing. And when you observe such data drift, you can have calibration mechanisms to recalibrate the data that you're getting and continue to function while you deploy people out to get it rectified, and things like that. So, it almost gives you the ability to have planned downtime, because you're not only seeing that the sensor has failed, but observing that the sensor will potentially fail down the line, and you can take corrective actions.

    Sridhar Vedantham: Right, so basically you're getting a heads up that something bad is going to happen with the sensor.

    Ajay Manchepalli: Exactly.

    Sridhar Vedantham: Great. And have you guys actually deployed this out in the field in real world scenarios and so on to figure out whether it works or not?

Akshay Nambi: Yeah, so this technology is already deployed in hundreds of devices in the space of agricultural farms, water monitoring and air pollution monitoring. To give you a concrete example, we are working with a company called Respirer, which is using Dependable IoT technology to provide reliable, high-fidelity pollution data to its customers and also policymakers. So, for example, Respirer today is able to provide, for every data point they measure, the status of the sensor- whether the sensor is working or faulty. This way users can filter out faulty or drifted data before consuming it. This has significantly increased the credibility of such low-cost sensors and the data they are generating. And the key novelty to highlight here is that we do this without any human intervention or redundancy. In fact, if you think about it, we are not even looking at the sensor data- we are looking at these electrical characteristics, which are completely orthogonal to the data, to determine whether the sensor is working, faulty, or drifted.

    Ajay Manchepalli: The interesting part of this work is that we observed in multiple real-world scenarios that there was a real need for reliability of such sensors, and it was really impacting their function. For example, there is a team that's working on smart agriculture, and the project is called FarmBeats. And in that case, we observed that they had these sensors deployed out in the fields and out there in the farms, you have harsh conditions, and sensors could easily get damaged, and they had to actually deploy people to go and figure out what the issue is. And it became very evident and clear that it was important for us to be able to solve that challenge of helping them figure out if the sensor is working or not, and the ability to do that remotely. So that that was sort of the beginning and maybe Akshay, you can talk about the other two projects that led after that.

Akshay Nambi: Right. So one, as I mentioned, is Respirer, which is attaching a sensor status to every pollution measurement so its customers and policymakers can rely on the data. To give another example, we're also working with Microsoft for Startups and Accenture with an NGO called Jaljeevika, which focuses on improving the livelihoods of small-scale fish farmers. They have an IoT device that monitors the temperature, TDS and pH of water bodies to provide advisories for fish farmers. Again, since these sensors are deployed in remote locations and farmers are relying on this data and the advisories being generated, it is very critical to collect reliable data. And today Jaljeevika is using Dependable IoT technology to ensure the advisories generated are based on reliable IoT data.

    [Music]

    Sridhar Vedantham: Right, so this is quite inspiring, that you've actually managed to deploy these things in, you know, real life scenarios and it's already giving benefits to the people that you're working with. You know, what always interests me with research, especially when you have research that’s deployed in the field- is there anything that came out of this that surprised you in terms of learning, in terms of outcome of the experiments that you conducted?

Akshay Nambi: Yeah, so I can give you one concrete learning, going back to air pollution sensors. We have heard of partners finding these sensors going bad within just a few weeks of deployment, and before this they had no way to figure out what was wrong with them. Using our technology, in many cases they were able to pinpoint- yes, these are faulty sensors which need replacement, right? And there was also another interesting scenario where the sensor was working well- it's just that, because of dust, the sensor was showing wrong data. We were able to diagnose that and inform the partner that all they had to do was clean the sensor, which would bring it back to its normal state, as opposed to discarding it. So that was a great learning we had from the field.

    Ajay Manchepalli: The interesting thing that we observed in all these scenarios is how the entire industry is trusting data, and using this data to make business decisions, and they don't have a reliable way to say whether the data is valid or not. That was mind boggling. You're calling data as the new oil, we are deploying these things, and we're collecting the data and making business decisions, and you're not even sure if that data that you've made your decision on is valid. To us it came as a surprise that there wasn't enough already done to solve these challenges and that in some sense was the inspiration to go figure out what it is that we can do to empower these people, because at the end of the day, your decision is only as good as the data.

    Sridhar Vedantham: Right. So, you know, one thing that I ask all my guests on the podcast is, you know, the kind of work that you guys do and you're talking about is truly phenomenal. And is there any way for people outside of Microsoft Research or Microsoft to actually be able to use the research that you guys have done and to be able to deploy it themselves?

Akshay Nambi: Yeah. Yeah. So all our work is in the public domain. We have published numerous papers at top conferences in the areas of IoT and sensors, and all of these are easily accessible from our project page, aka.ms/dependableIoT. In fact, recently we also made our software code available through an SDK on GitHub, which we call Verified Telemetry. So IoT developers can now seamlessly integrate this SDK into their IoT devices and get the sensor status readily. We have also provided multiple samples showing how to integrate it with a device, how to use a solution sample, and so on. So if you are interested, please visit aka.ms/verifiedtelemetry to access our code.

    Sridhar Vedantham: Right, and it's also very nice when a research project name clearly and concisely says what it is all about. Verified Telemetry- it's a good name.

    Akshay Nambi: Thank you.

    Sridhar Vedantham: All right, so we're kind of coming to the end of the podcast. But before we, you know, kind of wind this thing up- what are you looking at in terms of future work? I mean, where do you go with this?

Akshay Nambi: So, till now we have mostly focused on specific scenarios in environmental monitoring and so on, right? One area we are thinking deeply about is autonomous and safety-critical systems. Imagine a faulty sensor in a self-driving vehicle or an autonomous drone, right? Or on an automated factory floor, where data from these sensors is used to take decisions without a human in the loop. In such cases, bad data leads to catastrophic decisions. Recently we explored one such safety-critical sensor- smoke detectors. As we all know, smoke detectors are deployed in numerous scenarios, from hospitals to shopping malls to buildings, and the key question we went after is: how do you know if your smoke detector is working or not? To address this, what people do today, especially in hospitals, is a manual routine maintenance check, where a person uses an aerosol can to trigger the smoke alarm and then turns it off in the back end.

    Sridhar Vedantham: OK, that does not sound very efficient.

Akshay Nambi: Exactly, and it's also a very laborious process that significantly limits the frequency of testing. And the key challenge, unlike with other sensors, is that you cannot notice failures unless there is a fire event or smoke.

    Sridhar Vedantham: Right.

Akshay Nambi: Thus it is imperative to know whether your detector is working or not in a non-smoke condition. We have again developed a novel fingerprint which can do this, and this way we can detect whether a sensor is working or faulty even before a fire event occurs and alert the operators in a timely manner. So for those who are interested and curious about how we do that, please visit our webpage and access the manuscript.

    Sridhar Vedantham: Yeah, so I will add links to the web page as well as to the GitHub repository in the transcript of this podcast.

    Akshay Nambi: Thank you.

    Sridhar Vedantham: Ajay, was there something you wanted to add to that?

Ajay Manchepalli: Yeah, in all our early deployments we have seen that sensor faults are one of the primary issues that come into play, and that's what this work has been addressing. But there are other scenarios that come up that are very relevant and can empower these use cases even further- things like data drift, or observing that sensors are not connected correctly to the devices, and some form of sensor identification. These are things we can extend on top of what we already have. And while they are incremental changes in terms of capability, the impact and potential they can have for those scenarios is tremendous. And what keeps it exciting is that all the work we are doing is driven by the actual needs that we are seeing out there in the field.

    Sridhar Vedantham: Excellent work. And Akshay and Ajay, thank you so much once again for your time.

    Akshay Nambi: Thank you Sridhar. Great having this conversation with you.

    Ajay Manchepalli: Yep, thanks Sridhar. This is exciting work, and we can't wait to do more and share more with the world.

    [Music]

  • Microsoft Research India Podcast – Podcast (10)
    Research @Microsoft Research India: interdisciplinary and impactful. With Dr. Sriram Rajamani19 apr 2021· Microsoft Research India Podcast

    Episode 008 | April 20, 2021

Microsoft Research India is constantly exploring how research can enable new technologies that positively impact the lives of people while also opening new frontiers in computer science and technology itself. In this podcast we speak to Dr. Sriram Rajamani, distinguished scientist and Managing Director of the Microsoft Research India lab. We talk about some of the projects in the lab that are making fundamental changes to computing at Internet scale, computing at the edge, and the role he thinks technology should play in the future to ensure digital fairness and inclusion. Sriram also talks to us about a variety of things: his own journey as a researcher, how the lab has changed from the time he joined it years ago, and his vision for the lab.

    Sriram’s research interests are in designing, building and analyzing computer systems in a principled manner. Over the years he has worked on various topics including Hardware and Software Verification, Type Systems, Language Design, Distributed Systems, Security and Privacy. His current research interest is in combining Program Synthesis and Machine Learning.

    Together with Tom Ball, he was awarded the CAV 2011 Award for “contributions to software model checking, specifically the development of the SLAM/SDV software model checker that successfully demonstrated computer-aided verification techniques on real programs.” Sriram was elected ACM Fellow in 2015 for contributions to software analysis and defect detection, and Fellow of Indian National Academy of Engineering in 2016.

    Sriram was general chair for POPL 2015 in India, and was program Co-Chair for CAV 2005. He co-founded the Mysore Park Series, and the ISEC conference series in India. He serves on the CACM editorial board as co-chair for special regional sections, to bring computing innovations from around the world to CACM.

    Sriram has a PhD from UC Berkeley, MS from University of Virginia and BEng from College of Engineering, Guindy, all with specialization in Computer Science. In 2020, he was named as a Distinguished Alumnus by College of Engineering, Guindy.

For more information about Microsoft Research India, click here.

    Related

    Microsoft Research India Podcast: More podcasts from MSR IndiaiTunes: Subscribe and listen to new podcasts on iTunesAndroidRSS FeedSpotifyGoogle PodcastsEmail

    Transcript

Sriram Rajamani: We are not like an ivory tower lab. You know, we are not a lab that just writes papers. We are a lab that gets its hands and feet dirty- we get in there, you know, we test our assumptions, see whether it works, learn from them, and in that sense the problems that we work on are a lot more real than in a purely academic environment.

    [Music]

    Sridhar Vedantham: Welcome to the Microsoft Research India podcast, where we explore cutting-edge research that’s impacting technology and society. I’m your host, Sridhar Vedantham.

    [Music]

Sridhar Vedantham: Microsoft Research India is constantly exploring how research can enable new technologies that positively impact the lives of people while also opening new frontiers in computer science and technology itself. In this podcast we speak to Dr. Sriram Rajamani, distinguished scientist and Managing Director of the Microsoft Research India Lab. We talk about some of the projects in the lab that are making fundamental changes to computing at Internet scale, computing at the edge, and the role he thinks technology should play in the future to ensure digital fairness and inclusion. Sriram also talks to us about a variety of things: his own journey as a researcher, how the lab has changed from the time he joined it many years ago, and his vision for the lab.

    Sridhar Vedantham: So today we have a very special guest on the podcast, and he is none other than Dr. Sriram Rajamani, who is the Managing Director of the Microsoft Research Lab in India. So Sriram welcome to the podcast.

    Sriram Rajamani: Yeah, thank you. Thank you for having me here, Sridhar.

    Sridhar Vedantham: OK, you've been around in Microsoft Research for quite a while, right? Can you give me a brief background as to how you joined and when you join and what's your journey been in MSR so far?

Sriram Rajamani: Yeah, so I joined in 1999. And, oh man, it's now 22 years, I guess. I've been here for a while.

    Sridhar Vedantham: That's a long time.

Sriram Rajamani: I joined Microsoft Research in Redmond right after I finished my PhD at Berkeley. And, you know, my PhD was in formal verification, so my initial work at Microsoft in Redmond was in the area of formal verification. Then at some point I moved to India, around 2006 or something like that. So I think I spent about six or seven years in Redmond and my remaining time- another 15 years- in India. So that's been my journey, yeah.

    Sridhar Vedantham: OK, so this is interesting, right, because, you know, we constantly hear about India as being this great talent pool for software engineers, but we certainly don't hear as often that it is a great place for a computer science research lab. Why do you think a Microsoft Research lab in India works and what drew you to the lab here?

Sriram Rajamani: I'm a scientist, and I joined MSR because I wanted to do high quality science work that is also applicable in the real world, you know. That's why I joined MSR. And the reason I moved to India was because at some point I just wanted to live here- I wanted to live here because I have family here and so on- and then Anandan started the lab, and so things came together, and that's why I personally moved. But if you ask me why it makes sense for MSR to have a lab here, the reasons are quite clear.

I think we are such a big country, we have enormous talent. I think talent is the number one reason we are here. Particularly unique to India is that we have really strong undergraduate talent, which is why we have programs like our Research Fellow program. But over the past many years, the PhD talent has also been getting better and better. As you know, initially when we started, we recruited many PhDs from abroad- people who had done their PhD abroad and then returned, just like me. But over the years we've also recruited many PhDs from Indian institutions as well.

    So, I think that talent is the number one reason.

The second reason is, you know, the local tech ecosystem is very different. It started out as a service industry for the West- essentially, with all of the software we were doing, we were servicing companies in the Western Hemisphere. But over time, India has also become a local consumer of technology, right? If you think about, you know, Ola or Flipkart, the country is now using technology for its own local purposes. And because of the size and scale of the country, and the amount the government and industry are pushing digitization, there's a huge opportunity there as well.

And finally, I would say another reason to have a lab in a place like India is that it's a very unique testbed. You know, cost is a huge concern in a place like India- technology has to be really low cost for it to be adopted here. There are very severe resource constraints, be it bandwidth… you know, if you think about NLP, many of our languages don't have data resources. Very unreliable infrastructure- things fail all the time. And so, you know, I've heard it said that if you build something so that it works in India, it works anywhere. So it's a testbed to actually build something.

    If you can deploy it and make it work here, you can make it work anywhere. So in that sense actually it's also another reason.

    Sridhar Vedantham: OK, so basically it works here it's a good certification that it'll work anywhere in the world.

    Sriram Rajamani: Yeah, yeah.

    Sridhar Vedantham: All right. OK Sriram, so here's something I'm very curious about. How does a research scientist end up becoming the managing director of a lab?

Sriram Rajamani: So the short answer is that it was rather unplanned, but maybe I can give a longer answer. You know, I started out being a researcher like anyone else who joins MSR. My initial projects were all in the area of formal verification- I built, together with Tom Ball, something called Static Driver Verifier that used formal methods to improve Windows reliability. Then I worked on verifiable design- how can you do better design so that you produce better systems?

Then I worked on, you know, security, and now I work on machine learning and program synthesis. And a common thread in my work has always been the use of programming languages and formal methods to understand how to build various kinds of systems- be it drivers, secure systems, or machine learning systems. That has been the theme underlying my research. But to answer your question as to how I became lab director: some years after I moved to MSR India, Anandan, who was the lab director then, left. There was a leadership churn there, and at the time I was asked whether I would consider being the lab director. The first time, I declined, because I had many other technical projects going on. But I got the opportunity a second time, when Chandu decided to move on and Chandu and Jeanette really encouraged me. I had been in MSR maybe 15-16 years when that happened. And one of the reasons I decided to take this up was that I felt very strongly for MSR- I thought that MSR had given me a lot, and I wanted to give back to MSR and MSR India.

And MSR India is easily one of the best computer science industrial labs in this part of the world.

And, you know, it made sense that I devote my time to supporting my colleagues and growing the lab in ambition and impact. I had a sense of purpose in that, and so I decided to take this on. So the real answer to your question is, I don't think anyone plans to be a lab director- sometimes you get an opportunity to become one, and sometimes you say yes.

    Sridhar Vedantham: Great. OK, and you know, given that you've been in the lab here in India for quite a while, how do you see the lab having evolved over the years? I mean, I'm sure there are lots of things that have changed quite a bit. So, what do you think are those things that have changed quite a bit and what's not changed and what are the kind of things that you'd like to preserve going forward?

Sriram Rajamani: Yeah, I think the number one thing that has not changed is quality. I've now been here for 21 years, and I've been with MSR India, you know, from the very beginning- I came here 6-7 months after the lab started. We've always had great people, and the quality of the work we do has always been exceptional. But I think what has changed over the years is that we think much more end to end. When I joined, you know, in '99, we were sort of more academic in nature. We always used to publish in high quality conferences, which we still do. But what we do more now is think much more end to end. We are no longer satisfied with solving a particular piece of a problem; we think about how that piece connects with many, many other pieces, some social, some technical, and how those things fit together broadly to solve problems end to end. As a result, more often than not we deploy what we build- either at scale, in solutions that are adopted by product groups, or in our communities- to validate whether what we think creates a change does indeed create a change, and we learn from that and use it to reframe our problems and test our assumptions. And so, you know, we are not like an ivory tower lab. We are not a lab that just writes papers. We are a lab that gets its hands and feet dirty- we get in there, you know, we test our assumptions, see whether it works, learn from them, and in that sense the problems that we work on are a lot more real than in a purely academic environment.

    I think that that's the way in which things have changed. And I think partly also, Sridhar, as you do that,

    we have become a lot more interdisciplinary…

    Sridhar Vedantham: Right.

    Sriram Rajamani: …you know, if you look at our projects today, right? Because if you want to get something to work end to end, it is not just one piece you build, you know. You have to make it interdisciplinary and many of our projects are interdisciplinary. I think that's the other way in which we’ve changed.

    Sridhar Vedantham: Yeah, in fact this particular term, right, interdisciplinary research- is something that I've heard quite often coming from you. Do you want to just bring up a couple of examples of what you mean by interdisciplinary research through by using some projects as examples?

Sriram Rajamani: Yeah, I can give, like, two or three, you know. The first one that comes to mind is EzPC, our multiparty computation project. If you look at how that project works, the whole goal is to take computations- be it DNN training or DNN inference- and run them securely across multiple parties. And, you know, that's a pretty complex problem. There are compiler people, there are programming languages people, and there are cryptographers. All of them work together to build a solution where the programmer can express their computation in some language, and there's a compiler that compiles it, and then there are a lot of cryptography smarts in order to make this multiparty computation work. And that's very unique- you can't do this without the compiler people and the cryptographers getting together.
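For readers unfamiliar with multiparty computation, a toy example of additive secret sharing- one of the standard building blocks in this space- shows the flavor of what the cryptography achieves. This is a generic illustration, not EzPC's actual protocol.

```python
import secrets

P = 2**61 - 1   # a public prime modulus; all arithmetic is mod P

def share(x, n=2):
    """Split secret x into n additive shares: each share looks uniformly
    random on its own, but the shares sum to x mod P."""
    shares = [secrets.randbelow(P) for _ in range(n - 1)]
    shares.append((x - sum(shares)) % P)
    return shares

def add_shared(shares_a, shares_b):
    """Each party adds its own shares locally; no secret is ever revealed."""
    return [(a + b) % P for a, b in zip(shares_a, shares_b)]

def reconstruct(shares):
    return sum(shares) % P

# Two parties jointly compute x + y without either seeing the other's input.
x_shares, y_shares = share(123), share(456)
assert reconstruct(add_shared(x_shares, y_shares)) == 579
```

Real systems in this space compile entire DNN computations- including multiplications, which need extra protocol machinery- down to operations over such shares, which is where the compiler-plus-cryptography collaboration comes in.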

Another example, you know, is our Akupara semantic search work. That's actually a combination of algorithms work, machine learning work and systems work, so that we can index trillions of vectors and look them up in a reasonable amount of time with a reasonable number of machines. I mean, I can't imagine doing that without expertise in algorithms, machine learning and systems.
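As a toy illustration of vector semantic search, here is an exact nearest-neighbor lookup by cosine similarity; systems like the one described replace this brute-force scan with approximate indices so lookups stay fast at billions or trillions of vectors. This is a generic sketch, not Akupara's algorithm.

```python
import numpy as np

def top_k(query, index, k=5):
    """Exact nearest-neighbor search: score every indexed vector by cosine
    similarity to the query and return the k best. Cost grows linearly with
    the index, which is why trillion-scale systems need approximate indices."""
    index_norm = index / np.linalg.norm(index, axis=1, keepdims=True)
    q = query / np.linalg.norm(query)
    sims = index_norm @ q                 # cosine similarity to every vector
    return np.argsort(-sims)[:k]          # positions of the k closest vectors

vectors = np.random.randn(10_000, 128)    # toy "semantic index" of embeddings
print(top_k(np.random.randn(128), vectors))
```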

And if you look at our more recent, you know, societal impact projects- we have this project called HAMS, which we are using to improve road safety. That actually has quite a bit of tech, like computer vision, and you have to make it work on a smartphone, so you need systems innovation to make that happen. And it has quite a bit of HCI. I mean, it has to work in an environment where you go into a driver licensing RTO and it should just work there, right? It has to work with people, and the feedback that is given should be consumable- if somebody fails a driving test, the feedback has to be given in such a way that it's a positive experience for them even if they fail the test, right? So it has all these interdisciplinary aspects. I hope that gives you a bit of a flavor for what it takes to solve things end to end.

    [Music]

    Sridhar Vedantham: A lot of the listeners of this podcast are not going to be really familiar with all the work we do here at MSR India. In your mind, how do you categorize or bucket the different research that goes on in the lab?

    Sriram Rajamani: We now think about our work as organized into themes, and the themes are different from our areas of expertise. From the beginning of the lab, we have had four broad areas of expertise.

    First, we have people with expertise in algorithms; second, in machine learning; and third, in systems, very broadly interpreted to include programming languages, distributed systems, networking, security and so on.

    And fourth, we have people who do human-computer interaction and social sciences.

    Those are our four areas of expertise, but the way we organize our work now is in themes, and we have five. One theme is around large-scale machine learning: things like recommendation systems, search and large multilingual learning. It spans the entire gamut from practical machine learning algorithms to the systems needed to build and scale them.

    Then we have two systems-related themes. One is data-driven systems and networking, where we use telemetry and the enormous amount of data we get from large-scale cloud systems to do machine learning on them and improve those systems themselves. The second is what we call co-designed systems, where we do interdisciplinary systems work that spans distributed systems, security, privacy, programming languages and verification, thinking about systems much more holistically.

    Another theme is edge computing, where we think about machine learning, systems and usability on the edge, which is such an important topic from the perspective of India. And the last theme is socio-technical systems and inclusion, where we think about technology as an ally and enabler for inclusion and empowerment.

    And each of these five themes draws on the expertise of people from these various disciplines.

    Sridhar Vedantham: Great. I've heard you talk many times about things like tech at scale, and a couple of things you've said stick in my mind: there is tech at scale, and then there is tech in minute form, though I forget the exact terms you use. Socio-technical computing is also quite big at MSR India right now. Could you give me a flavor of what exactly is happening in the tech at scale area and in socio-technical computing?

    Sriram Rajamani: Yeah, I think tech at scale is quite important because digital systems are very pervasive now. The pandemic has only accelerated the adoption of digital systems. Most interactions these days are online, and even when we come back from the pandemic, it's going to be hybrid. The amount of information is just increasing and increasing, and as a result, for any useful user experience we need to be able to sort through this huge amount of information and make the right information available at the right time. That, in some sense, is the primary goal of AI, machine learning and systems at scale. Most of our at-scale work is about how to build systems that use AI and machine learning to process humongous amounts of information (billions and trillions of pages, documents or vectors), understand it, and make sure the right information is available to you at the right time. And how do you do that reliably? How do you do that securely? How do you do that while preserving privacy? That, I think, is the crux of our tech at scale.

    I already mentioned Akupara, which is a trillion-scale index and serving system that we are building for semantic search. Another at-scale project is extreme classification, where we are trying to build classifiers that can take an object and classify it into hundreds of millions of categories. Typically, when we think about machine learning, we think about classifying a picture as a cat or a dog, a small number of categories. But in extreme classification we take an object like a web page or a document and classify it into potentially hundreds of millions of, for example, topics: what are the topics this document is talking about, or, if the object is an advertisement, what are the keyword bid phrases relevant to it? Those kinds of classifications are significantly more complex, and our lab really originated this field and is a thought leader in it.
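
    To illustrate why extreme classification needs more than vanilla one-vs-rest, here is a sketch of the two-stage pattern many such systems use: first shortlist a few candidate labels cheaply, then score only the shortlist. The clustering, sizes and scoring below are hypothetical stand-ins, not the lab's actual algorithms.

```python
# Two-stage "shortlist then score" sketch for classification over huge label sets.
import numpy as np

rng = np.random.default_rng(1)
D, L, C = 32, 100_000, 100                    # feature dim, labels, coarse clusters
label_emb = rng.normal(size=(L, D)).astype(np.float32)
cluster_of = rng.integers(0, C, size=L)       # pretend label clustering
centroids = np.stack([label_emb[cluster_of == c].mean(axis=0) for c in range(C)])

def predict_topk(x, k=5, n_clusters=3):
    # Stage 1: pick promising clusters (cheap: only C comparisons).
    best = np.argsort(-(centroids @ x))[:n_clusters]
    shortlist = np.flatnonzero(np.isin(cluster_of, best))
    # Stage 2: exact scoring on the shortlist only, not all L labels.
    scores = label_emb[shortlist] @ x
    return shortlist[np.argsort(-scores)[:k]]

print(predict_topk(rng.normal(size=D).astype(np.float32)))
```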

    Another piece of at-scale work is in DNN training. Deep neural network training is an extremely resource-intensive process: if you train deep neural networks on billions and billions of training points, that uses a huge number of GPUs and other hardware resources. Can you do that more efficiently? We have a project called Gandiva that improves the throughput of the infrastructure we use to train these kinds of DNNs.

    And I want to give you one more example. We have a project called Sankie, which uses telemetry from large software engineering processes, including coding, testing and development, to improve the productivity of the engineering itself. So those are the kinds of at-scale AI, ML and systems projects that we do. And I think every CS lab has to do that, because that is the real world today.

    Sridhar Vedantham: And we've actually done a podcast earlier on Sankie, so I think I'll link through to that when we publish the transcript of this podcast.

    Sriram Rajamani: Wonderful.

    Sridhar Vedantham: Right. And do you want to talk a little more about socio-technical computing? This is something I personally find quite fascinating, because this lab has always had, from the very beginning, a focus on the ICTD space. But the projects happening now seem to be taking that to a different level altogether, in terms of actually going out there, deploying at scale and figuring out the impact.

    Sriram Rajamani: You're right, Sridhar, Technology for Emerging Markets has been a really interesting area in the lab since its inception. But one thing that has changed is this: when the lab was started, the kind of technology available to everybody in rural India was very different from the technology all of us use. They had maybe feature phones while everybody else had smartphones and so on. But now connectivity and smartphone penetration have increased significantly. In some sense, through 4G and so on, the cloud and the mobile are much more accessible, much more prevalent these days. The problems are still there: bandwidth is a problem, things don't work in local languages, English works much better than local languages. Those constraints remain, but the technology platform has been up-leveled throughout the country.

    So as a result, if you take our own work on socio-technical computing, we are doing technologically more sophisticated things now than we did before, because more sophisticated technology is accessible to a much broader population of the country. That is one way things have changed, and it's why we are now able to do projects like HAMS for driver license testing: even an RTO in a rural area has access to smartphones, and they are interested to see whether driver license testing can be streamlined. The high tide has lifted the technology everywhere. Another example is BlendNet, where we are using peer-to-peer connectivity to help people share media and other bulky content better. And the reason we are doing this is that the desire to view movies and entertainment on smartphones is very widespread throughout the country.

    And just this morning I was looking at a piece of news about this company Respirer Living Sciences. We have a collaboration with them to measure air pollution; they want to monitor pollution and democratize the data. This is such an important problem now, but if you look at what is needed to do it, we have to solve really hard technical problems. How do you make sure the sensors doing the sensing are reliable? How do you tell when a sensor's calibration has drifted, and how do you re-calibrate it? These are hardcore technology problems that are important to solve for a societal problem like air pollution.

    So another way things have changed is that previously all our societal-scale problems were perhaps sort of low tech; that's no longer true. That doesn't mean the tech works as it is, right?

    We still work on projects like Karya, where we are doing data collection and crowdsourcing for low-resource Indian languages, and that requires us to build user interfaces that work for semi-literate and illiterate users, and to make sure we can cater to the multilingual population in the country and so on. So user-centered design, and the need to design for people on the other side of the digital divide, is still important, right?

    But at the same time the tech tidal wave has also lifted things up, so that's the dynamic here, I think.

    Sridhar Vedantham: Right, and there's a bit of a conundrum here. At one point it was assumed that technology itself was going to make people's lives better, and we've obviously seen technology permeate to levels within society that it has never permeated before. But this brings up questions of digital inclusion, fairness and equitable access to information and to the benefits of technology. So, a couple of questions. How do we actually ensure things like digital inclusion and fairness? And given unprecedented situations like the one we find ourselves in now, in the midst of a pandemic, how does this impact people and society at large?

    Sriram Rajamani: In spite of the fact that digital technology has permeated, it is very clear that technology is still very non-inclusive. That is also true at the same time. So there is no silver bullet to the question you're asking. It's extremely important for us as scientists and technologists to think about underserved populations and communities, and to see whether the technologies we build are inclusive and whether they are useful. I'll give the example of what Manohar Swaminathan is doing with his work on accessibility. He has done quite a bit of study in schools for visually impaired children, looking at the curriculum they have in STEM, computing and computational thinking, and asking whether the tools, curriculum and technologies we have are actually reaching this demographic. And the answer is no.

    Quite a bit of work needs to be done to make sure that children with vision impairment are educated in digital technologies and that the technology is inclusive. There's a huge gap there, so his work is particularly inspiring in that sense. And problems like awareness and literacy are very hard to solve. You can make a smartphone cheaper, and you can make 4G and 5G more available, but things like literacy, cognition and understanding of what's actually going on take many, many generations to resolve. So one has to think about people's context and people's preparedness when thinking about inclusion.

    Sridhar Vedantham: Great. I'm going to be cognizant of your time; I know you've got a bunch of meetings every day. So before we sign off, are there any final thoughts?

    Sriram Rajamani: Yeah, I would say that the pandemic has really accelerated digital transformation, but at the same time it has also exacerbated the gap between the rich and the poor. So this is a very interesting time for scientists and technologists. On the one hand, science is an important hope for getting us out of the pandemic, be it vaccines or the digital technology that helps us communicate and collaborate even when we are in our homes. And to serve the large number of people who need it, we have to build technology at scale. At the same time, the virus doesn't discriminate between rich and poor, and it doesn't discriminate based on race or gender. If we are to get out of the pandemic, we have to make sure the solutions, be they vaccines or anything else, reach everyone. If anything, the pandemic has taught us that unless we serve everyone, problems like the pandemic, and the same is true of climate change, are not going to be solved. These are universal problems, and by definition the solutions have to be inclusive. So my closing comment would be for technologists to build technology in such a way that it brings people together: have empathy for people in every shape, size and form, and make sure that what we build serves the whole world.

    Sridhar Vedantham: OK Sriram, thank you so much for your time. This has been a fascinating conversation.

    Sriram Rajamani: Yeah, thank you Sridhar, and I wish the listeners health and happiness in the rest of the year as well. Thank you.

    [Music Ends]

  • Microsoft Research India Podcast – Podcast (11)
    Helping young students build a career in research through the MSR India Research Fellow program. With Shruti Rijhwani and Dr. Vivek Seshadri21 Dec 2020· Microsoft Research India Podcast

    Episode 007 | December 22, 2020

    One of Microsoft Research India’s goals is to help strengthen the research ecosystem and encourage young students to look at research as a career. But it is not always easy for students to understand what research is all about and how to figure out if research is the right career for them. The Research Fellow program at Microsoft Research India enables bright young students to work on real-world research problems with top notch researchers across the research lifecycle, including ideation, implementation, evaluation, and deployment. Many of the students who have been part of the program have gone on to become researchers, engineers and entrepreneurs.

    Today, we speak to Shruti Rijhwani, a graduate of MSR India’s Research Fellow program who is currently doing her PhD at Carnegie Mellon University, and joining us is Dr. Vivek Seshadri, a researcher at MSR India who also heads the Research Fellow program at the lab.

    Shruti was a research fellow at MSR India in 2016, working on natural language processing models for code-switched text.

    She is currently a PhD student at the Language Technologies Institute at Carnegie Mellon University. Stemming from her work at MSR India, she has continued research in multilingual NLP, with a focus on low-resource and endangered languages.

    Vivek primarily works with the Technology for Emerging Markets group at Microsoft Research India. He received his bachelor’s degree in Computer Science from IIT Madras, and a Ph.D. in Computer Science from Carnegie Mellon University where he worked on problems related to Computer Architecture and Systems. After his Ph.D., Vivek decided to work on problems that directly impact people, particularly in developing economies like India. Vivek is also the Director for the Research Fellow program at MSR India.

    For more information about the Research Fellow program, click here.

    Related

    Microsoft Research India Podcast: More podcasts from MSR India · iTunes: Subscribe and listen to new podcasts on iTunes · Android · RSS Feed · Spotify · Google Podcasts · Email

    Transcript

    Shruti Rijhwani: I think I credit my whole graduate school decision-making process, the application process, and even the way I do research in grad school to my experience as a Research Fellow in MSR India. Of course, the first thing was that I wasn't even sure whether I wanted to go to grad school, but after going through the Research Fellow program and with my amazing mentors and collaborators at MSR India, I took the decision to apply to grad school.

    [Music]

    Sridhar Vedantham: Welcome to the Microsoft Research India podcast, where we explore cutting-edge research that’s impacting technology and society. I’m your host, Sridhar Vedantham.

    [Music]

    One of Microsoft Research India’s goals is to help strengthen the research ecosystem and encourage young students to look at research as a career. But it is not always easy for students to understand what research is all about and how to figure out if research is the right career for them. The Research Fellow program at Microsoft Research India enables bright young students to work on real-world research problems with top notch researchers across the research lifecycle, including ideation, implementation, evaluation, and deployment. Many of the students who have been part of the program have gone on to become researchers, engineers and entrepreneurs.

    Today, we speak to Shruti Rijhwani, a graduate of MSR India’s Research Fellow program who is currently doing her PhD at Carnegie Mellon University, and joining us is Dr. Vivek Seshadri, a researcher at MSR India who also heads the Research Fellow program at the lab.

    [Music]

    Sridhar Vedantham: OK, so I'm looking forward to this podcast because it's going to be a little different from what we've done in the past, in the sense that this is not a podcast about research projects or technologies, but it's something much more human, and we're going to be talking about the Research Fellow program that we have at MSR India.

    And I'd like to welcome a special guest- Shruti, who used to be a Research Fellow at the lab and also Vivek Seshadri, who is a researcher at the lab and whom we've had on the podcast earlier in a different capacity. But today he's wearing the hat of the czar or the director of the Research Fellow program here.

    So welcome, Shruti and Vivek.

    Vivek Seshadri: Good evening, Sridhar, and very good morning Shruti.

    Shruti Rijhwani: Hi Sridhar and Vivek, it’s great to be here and great to be back interacting with people from MSR India. It's been about four years since I left the RF program, so I'm really looking forward to talking about it and remembering some of my experiences.

    Sridhar Vedantham: Excellent. So let's lay a little bit of groundwork before we jump into the whole thing. Vivek, can you give us a bit of an overview of what the Research Fellow program is?

    Vivek Seshadri: Sridhar, the Research Fellow program has been around ever since the organization Microsoft Research India started itself. I think initially it was called the Assistant Researcher program and then it was called the Research Assistant Program and right now we're calling it the Research Fellow program. But the core of the program has been to enable recent undergraduate and Master’s students to spend one or two years at MSR India and get a taste for what research in computer science looks like, especially in an industrial setting.

    Sridhar Vedantham: So has the program evolved over time? Have there been any substantive changes, or is it still the same at its core and essence?

    Vivek Seshadri: I think the only thing that has changed significantly is the number of Research Fellows we have. When the program started, in its first year we had three assistant researchers, and today, as we speak, we have over 50 Research Fellows in the lab working on various projects. So the program has definitely grown in size along with the lab, but the core goal has not changed at all: it is still to give Research Fellows a taste of what research in computer science looks like, enable them to build their profile and prepare them for a career in computer science research and engineering.

    Sridhar Vedantham: Right, so one thing that I've seen personally is that the Research Fellows add a huge amount of energy and life to the lab. And on that note, Shruti, what motivated you to come into MSR India to join the Research Fellow program?

    Shruti Rijhwani: That's a great question, and my experience is probably similar to what a lot of Research Fellows go through before deciding to join the program. I did an undergrad degree in computer science from BITS Pilani, and during those four years I took classes on machine learning, information retrieval and so on, and also did two internships where I got a taste of how machine learning can be applied to products in the real world. But both of those internships were focused on the engineering side, and I was really interested in what a career in machine learning research, or in research-based applications of machine learning, would look like. I knew that if I wanted to pursue a career in this field, I would probably have to go to graduate school for a Master’s and a PhD, but I wasn't entirely sure whether that's what I wanted to do. So the Research Fellow program was an exploratory phase for me: I wanted to get some research experience, see what established researchers in computer science do on a daily basis, and understand what the research process is like when you're working in machine learning, and more specifically in natural language processing, which is what I was interested in.

    Sridhar Vedantham: Right. So, Vivek, what I'm getting from Shruti is that the Research Fellow program is something she saw as a base for a longer career in research. Do you have any insights into what the program actually offers, in a structured manner, to the Research Fellows we have?

    Vivek Seshadri: Yeah. Microsoft Research at its core is a research organization. Researchers at MSR work on a variety of projects spanning all areas of computer science, from theory and artificial intelligence to machine learning, systems and security, and MSR India is also known for its Technology for Emerging Markets (TEM) group, where we look at problems specifically affecting developing countries like India. Research Fellows join one of these projects and work with world-class researchers on multiple phases of research: ideation, building solutions, prototyping them, deploying them in the field, and working with large, real industrial datasets to test their solutions. That experience gives a perfect taste of what modern computer science research looks like. Just like Shruti, most of our Research Fellows apply for grad school after their stint at MSR India and go to one of the top grad schools across the world. Others decide research is not for them; many join Microsoft and continue to work there in some other role. And a few have taken the entrepreneurial route: our alumni include CEOs of big companies like Ola, and some of our own projects have been converted into startups like Digital Green and Everwell. But primarily, it's that experience of what computer science research looks like today that the program offers students.

    Sridhar Vedantham: Great. In terms of numbers, how many Research Fellows would have graduated from MSR India over the years?

    Vivek Seshadri: Like I said, in the initial years we had a handful of Research Fellows. Overall, it's around 150 Research Fellows over a period of 15 years. But if you look at our run rate in recent years, we have been graduating close to 30 Research Fellows every year, and that number is only increasing.

    Sridhar Vedantham: Great, so this is another contribution of MSR to the research field, I guess.

    Vivek Seshadri: Absolutely, yeah. And these folks hit the ground running. If they do choose to do a PhD, they know how research works, they know the different phases of a research project and how to go about tackling these problems. So I'm assuming all the advisors of past Research Fellows are extremely happy advisors, with a little less work to do in guiding their students.

    Sridhar Vedantham: [Laughing] OK, I think we need to speak to some of the advisors privately to validate that.

    Vivek Seshadri: Yes.

    Sridhar Vedantham: So Vivek, another question for you. You know MSR India has also got this very large internship program. What's the fundamental difference between an internship at MSR and a Research Fellow position?

    Vivek Seshadri: MSR used to have a very vibrant three-month summer internship program, and over the years what we have realized is that for any stint at MSR to benefit both the student and the research projects, you need a reasonable amount of time. Even in the internship program, we have almost phased out three-month internships; we only offer six-month internships these days. So the main difference between the traditional internship program and the Research Fellow program is duration. When a Research Fellow comes in, we know they are going to spend at least one year with us, and in most cases two, which means a researcher can confidently set a really big goal for the Research Fellow and take that long shot. In a short internship, to have any meaningful output, you cannot have a really large vision or goal. If you look at the contributions Research Fellows have made to some of our projects, they are extremely substantial: solving fundamental problems in computer science, impacting projects in major ways within Microsoft, and also having impact on society. These contributions would not be possible in a six-month internship program.

    [Music]

    Sridhar Vedantham: Shruti, when you were doing your Research Fellowship at MSR India, what kind of work were you doing, and who were your mentors here?

    Shruti Rijhwani: Right. As I said, I was really interested in machine learning applications, more specifically as applied to natural language processing problems. When I was a Research Fellow at MSR India, I was mentored by Monojit Choudhury and also worked closely with Kalika Bali on natural language processing problems. We were focusing on the problem of code switching, which is when multilingual people mix languages while speaking, and we were trying to create natural language processing tools to automatically understand such code-mixed language.

    Sridhar Vedantham: And how do you think that helped you? I mean you are at CMU now, which is one of the top tier PhD schools in the world. But how do you think the time that you spent here in MSR India actually helped you get to where you are?

    Shruti Rijhwani: I think I credit my whole graduate school decision-making process, the application process, and even the way I do research in grad school to my experience as a Research Fellow at MSR India. At first I wasn't even sure whether I wanted to go to grad school, but after going through the Research Fellow program, and with my amazing mentors and collaborators at MSR India, I took the decision to apply. As I said, I was working on problems in multilingual natural language processing. Although I focused on code switching at MSR India, it made me very interested in continuing in this field of automatically processing many, many languages at the same time, and through my Master's and now my PhD, all of my research has been focused on multilingual NLP. So in a way my experience at MSR shaped my view of NLP research and taught me important research and technical skills, going all the way from doing a literature survey to collecting data, running experiments and finally writing publications. I went through that process for the first time at MSR India, and it has really helped me through graduate school as well.

    Sridhar Vedantham: Great, so it sounds like you really worked hard while you were at MSR. Did you actually have some time to do something other than work, to kind of go out, have fun and enjoy yourself?

    Shruti Rijhwani: Yeah, I tried to keep a good work-life balance at MSR India, and in fact I think MSR and the Research Fellow program in general quite encourage that. We had a strong community of Research Fellows. We were all really good friends, which makes sense because we were all in the same exploratory phase, doing research for the first time, most of us. So there was a lot we had in common, and we enjoyed ourselves outside of work as well. I really enjoyed my time at MSR.

    Sridhar Vedantham: I believe MSR India's also got this fantastic annual offsite which is a great deal of fun.

    Shruti Rijhwani: Definitely. I was a Research Fellow for one year, and the offsite that year was really good fun. It's good to interact with people outside of work and get to know your collaborators as people, not just colleagues. So I really enjoyed the MSR offsite, and as Research Fellows we would also often have our own outings; we would go out and explore the city and so on. It was really fun, and I really appreciated the community of Research Fellows that we had.

    Sridhar Vedantham: Super. Vivek, a question for you. What does the Research Fellow selection process look like? How do you actually go about it, and is there any particular way in which you look at matching candidates to mentors?

    Vivek Seshadri: Absolutely. In many ways the Research Fellow selection process is similar to how grad school applications work, how universities select PhD students. There are multiple projects going on inside MSR India in different areas, and applicants come in with specific interests; these days, people develop interests in various areas even during their four years of undergrad. So our goal is to identify the best candidates for each project. And like I mentioned, it's not only that the number of Research Fellow slots has increased over the years, from three in the first year to close to 40 or 50 now; the number of applicants has also increased significantly. We receive close to 1000 applications, and it's a long review process where we ensure that each application gets at least one pair of eyes on it to determine whether there are projects within MSR India that suit the candidate. We come up with extensive rankings, and finally researchers go through that ranked pool of applicants and figure out who the best candidate for their project is.

    Sridhar Vedantham: So, Shruti, if I were to ask you what the main takeaway is for you from the Research Fellow program, what would you say?

    Shruti Rijhwani: Well, there are a whole bunch; I really gained so much from this program. As I already said, it shaped my view of research and gave me a good footing to start my journey through graduate school, with my Master's and now my PhD. I would really recommend this program to anyone looking to get some experience in computer science research, and if anyone wants to explore research as a career option, I think it's a really good first step. Particularly because researchers at MSR India work on such broad fields in computer science, just attending talks and having conversations about their research can really expand your view and give you a strong place to set your own career off in whatever field you're interested in. So that's my main takeaway. I really enjoyed the community feeling at MSR India, among the RFs as well as with the senior researchers in the lab, and it helped me so much in graduate school. I definitely recommend the program to anyone who has an inkling of interest in computer science research.

    Sridhar Vedantham: That's fantastic and I know for a fact that the people who are involved with the Research Fellow program and in general people at the lab really feel proud when, you know, people like you go through the program and then join some of the top organizations in the world. So Vivek, how do students apply for an RF position? And what's the kind of exposure that they can look at? I mean, I know we've been speaking to Shruti, who you know, kind of was focusing on one particular area, but in a more general sense, can you talk a bit about what students can expect when they come for a Research Fellow position here?

    Vivek Seshadri: Sridhar, if you look at computer science as a field in the past few years, it has become increasingly interdisciplinary. Microsoft Research India, being a flat organization, enables researchers and experts in different areas to collaborate with each other and solve big problems that require such collaboration. In fact, in the past few years we have had major impact on multiple problems where researchers from machine learning, systems, compilers and HCI have come together and offered big solutions. So given where computer science is today and where it is going, I think MSR India is an ideal place for new, budding researchers to come and gain experience in what interdisciplinary research looks like.

    In fact, just like Shruti mentioned, even though RFs work on a specific problem in a specific area, they not only have the opportunity to listen to people working in other areas through talks and lectures, they may also actively collaborate with other Research Fellows working in those areas. That experience of interdisciplinary research is essential for any budding computer scientist. So that's one of the main reasons why I feel MSR India is a great place for students to come and test what research in computer science looks like today.

    Sridhar Vedantham: And what's the process for somebody applying for the RF position?

    Vivek Seshadri: Yeah, like I mentioned, the application process is very similar to a grad school application. We require students to upload their resume and a statement of purpose. There's a portal: people can just search for the Research Fellow program at MSR India on the Internet and it will lead them directly to our page, from where they can apply. There's a single deadline this time, on January 15th (2021), which is right after grad school application deadlines. So if students are applying to grad school, they already have all the material needed to apply for the Research Fellow program.

    Sridhar Vedantham: Great. What I'll do is add links to the Research Fellow program to the podcast page, so that listeners can go and check out what the program offers.

    And Shruti and Vivek, thank you so much for your time. I know we're doing this across multiple time zones, so thank you for being so accommodating. It's been a great conversation.

    Vivek Seshadri: Likewise, Sridhar, thanks a lot.

    Shruti Rijhwani: Thanks, I had a great time.

    Sridhar Vedantham: Thank you, stay safe everybody.

  • Microsoft Research India Podcast – Podcast (12)
    Evaluating and validating research that aspires to societal impact in real world scenarios. With Tanuja Ganu19 Oct 2020· Microsoft Research India Podcast

    Episode 006 | October 20, 2020

    At Microsoft Research India, research focused on societal impact is typically a very interdisciplinary exercise that pulls together social scientists, technology experts and designers. But how does one evaluate or validate the actual impact of research in the real world? Today, we talk to Tanuja Ganu who manages the Societal Impact through Cloud and AI (or SCAI) group in MSR India. SCAI focuses on deploying research findings at scale in the real world to validate them, often working with a wide variety of collaborators including academia, social enterprises and startups.

    Tanuja is a Research SDE Manager at Microsoft Research, India. She is currently part of MSR’s new center for Societal Impact through Cloud and Artificial Intelligence (SCAI).

    Prior to joining MSR, she was a Co-Founder and CTO of DataGlen Technologies, a B2B startup focusing on AI for renewable energy and sustainability technologies. Before that, she worked as a Research Engineer at IBM Research, India.

    Tanuja completed her MS in Computer Science (Machine Learning) at the Indian Institute of Science (IISc, Bangalore). She has been recognized as an MIT Technology Review Innovator Under 35 (MIT TR 35) in 2014 and as IEEE Bangalore Woman Technologist of the Year in 2018. Her work has been covered by top technical media (IEEE Spectrum, MIT Technology Review, the CISCO Women Rock IT TV series, the IBM Research blog and Innovation 26X26: 26 innovations by 26 IBM women).

    Click here to go to the SCAI website.

    Related

    Microsoft Research India Podcast: More podcasts from MSR India · iTunes: Subscribe and listen to new podcasts on iTunes · Android · RSS Feed · Spotify · Google Podcasts · Email

    Transcript

    Tanuja Ganu: As the name suggests, SCAI, that is Societal Impact through Cloud and Artificial Intelligence, it is an incubation platform within MSR for us to ideate on such research ideas, work with our collaborators like academia, NGOs, social enterprises, startups, and to test or validate our hypothesis through very well defined real world deployments. At SCAI, it's an interdisciplinary team of social scientists, computer scientists, software engineers, designers, and program managers from the lab who come together for creating, nurturing and evaluating our research ideas through real world deployments and validations.

    [Music]

    Sridhar: Welcome to the Microsoft Research India podcast, where we explore cutting-edge research that’s impacting technology and society. I’m your host, Sridhar Vedantham.

    [Music]

    At Microsoft Research India, research focused on societal impact is typically a very interdisciplinary exercise that pulls together social scientists, technology experts and designers. But how does one evaluate or validate the actual impact of research in the real world? Today, we talk to Tanuja Ganu who manages the Societal Impact through Cloud and AI (or SCAI) group in MSR India. SCAI focuses on deploying research findings at scale in the real world to validate them, often working with a wide variety of collaborators including academia, social enterprises and startups.

    Tanuja has been recognized as one of MIT Technology Review’s Innovators Under 35 (MIT TR 35) in 2014 and by IEEE Bangalore as a Woman Technologist of the Year in 2018, and her work has been covered by top technical media.

    [Music]

    Sridhar Vedantham: Tanuja, welcome to the podcast. I'm really looking forward to this particular edition, because I know that you manage SCAI and it's quite an intriguing part of the lab. But before we get into that, tell us a little bit about yourself.

    Tanuja Ganu: First of all, thanks Sridhar for having me on the podcast today. I'm not a full-time researcher; I'm an engineer by training, and I have done my Master's in Computer Science. Over the last decade or so, my work has been primarily at the intersection of research and engineering, on the applied research side. Throughout my journey, working at research labs and a startup, I have been very interested in taking a research idea through the entire incubation phase to validate its applicability in real-world problem settings.

    Sridhar Vedantham: So, Tanuja, I know you manage this thing called SCAI within the lab, and I think it's a very interesting part of the lab. Talk to us a little bit about that, and especially expand on what the term SCAI itself stands for, because I myself keep tripping up on it whenever I try to explain it.

    Tanuja Ganu: Yes, Sridhar. Since its inception, the lab has been doing very interesting work in the societal impact space. Additionally, with the advances in artificial intelligence and cloud-based technologies in recent years, there are increased opportunities to address some of these societal problems through technology and amplify its positive effect. As the name suggests, SCAI, that is, Societal Impact through Cloud and Artificial Intelligence, is an incubation platform within MSR for us to ideate on such research ideas, work with collaborators like academia, NGOs, social enterprises and startups, and test or validate our hypotheses through well-defined real-world deployments. Our location in India also allows us to witness and carefully analyze various socio-economic challenges, so the solutions we ideate are inspired by Indian settings and in many cases are equally applicable to other parts of the world.

    Sridhar Vedantham: Interesting. So it sounds like there's a fair amount of difference between the kind of work that SCAI does and what the rest of the lab does in terms of research.

    Tanuja Ganu: At MSR India, research work is mainly along three different axes: first, advancing the state of the art in science and technology; second, inspiring the direction of technology advances; and third, building technology for driving societal impact. SCAI is primarily focused on the societal impact axis, and many of our projects also have very strong academic and technological impact. At SCAI, an interdisciplinary team of social scientists, computer scientists, software engineers, designers and program managers from the lab comes together for creating, nurturing and evaluating our research ideas through real-world deployments and validations. So that's really the difference between the other research we do at the lab and what we do at SCAI.

    Sridhar Vedantham: So when you decide to take up a project or accept it under the SCAI umbrella, what do you actually look for?

    Tanuja Ganu: Yeah, we look for a few things when defining a SCAI project. Firstly, it should address a significant real-world problem and have the potential to scale. Secondly, the problem should offer interesting research challenges for our team. Next, we ask whether we have credible partners or collaborators with the domain expertise to deploy, evaluate and validate our research. We also look at how we can define a rigorous impact evaluation plan for the project. And lastly, we look for feasible graduation paths for the project within a two-to-three-year horizon.

    Sridhar Vedantham: What do you mean by graduation?

    Tanuja Ganu: There are different ways in which a project can complete its execution at the SCAI center, and that's what we term graduation. And there can be different types of graduation paths, depending on the type of project.

    Sridhar Vedantham: OK, let's talk a little bit about some of the projects you are currently doing under the SCAI umbrella, because from what you've said so far, it sounds like there's probably a fairly wide spread of project types and quite a large variety in the things you're doing there.

    Tanuja Ganu: Yes, Sridhar, that's very true. We are working on a very diverse set of projects right now. To give a flavor of our work, I'll briefly discuss two or three projects. The first project is called HAMS, that is, Harnessing Automobiles for Safety. We all know road safety is a very critical issue: according to a World Bank report, globally there are 1.25 million road traffic deaths every year, and in India there is one death due to a road accident every four minutes. To understand and address this critical issue, the HAMS project was initiated by our team at MSR, including Venkat Padmanabhan, Akshay Nambi and Satish Sangameswaran. HAMS provides a low-cost solution which is being evaluated for automated driver license testing. It consists of a smartphone, with its associated sensors like the camera and accelerometer, fitted inside a car. It monitors the driver and the driving environment and, using AI and edge intelligence, provides effective feedback on safe driving practices. At present, HAMS has been deployed at the regional transport office in Dehradun, India, for conducting dozens of driver license tests a day, and the feedback from this deployment is very encouraging, since it brings transparency and objectivity to the overall license testing and evaluation process.

    The second project is in the domain of natural language processing. It's called Interactive Neural Machine Translation (INMT) and was initiated by Kalika Bali and Monojit Choudhury in our NLP team. There are 7000-plus spoken languages worldwide, and for many use cases we need to translate content from one language to another. Though there are many commercial machine translation tools available today, they are applicable to a very small subset of languages, say 100, which have sufficiently large digital datasets available to train machine learning models. To aid the human translation process, as well as to create digital datasets for many low-resource or underserved languages, we combine innovations from deep learning and human-computer interaction and bring the human into the loop. In INMT, the initial translation model is bootstrapped using the small dataset available for a language, and INMT then provides quick suggestions to human translators while they are translating. Over time, this also helps create larger digital datasets, which increases translation accuracy for such underserved languages. We are currently working with three external collaborators, Pratham Books, Translators Without Borders and CGNet Swara, to evaluate and enhance INMT. Pratham Books is a nonprofit publisher that would like to translate children's story books into as many languages as possible. Translators Without Borders is a nonprofit working in crisis relief, health and education, and they would like to evaluate INMT for the Ethiopian language Tigrinya. Our other collaborator, CGNet Swara, is working with INMT to collect a Hindi-Gondi dataset. And just to give you one last flavor of one more project…
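
    To make the interaction loop concrete, here is a toy sketch of the suggest-and-accept cycle an INMT-style tool supports. The phrase table and helper functions are hypothetical stand-ins; the real system uses a neural translation model bootstrapped on a small dataset, and every accepted or corrected word becomes new training data over time.

```python
# Toy human-in-the-loop translation loop (illustrative only).
PHRASE_TABLE = {                      # hypothetical stand-in for a trained model
    "child": ["bachcha", "baalak"],
    "book": ["kitaab", "pustak"],
}

def suggest(source_tokens, target_prefix):
    """Suggest next-word candidates given the source and the partial translation."""
    covered = len(target_prefix)      # naive one-to-one alignment assumption
    if covered >= len(source_tokens):
        return []
    return PHRASE_TABLE.get(source_tokens[covered], [])

def translate_interactively(source_tokens, pick):
    """Model suggests, the human (simulated here by `pick`) accepts or types a word."""
    target = []
    while len(target) < len(source_tokens):
        target.append(pick(suggest(source_tokens, target)))
    return target

# A "human" who accepts the first suggestion, or types a word when there is none.
result = translate_interactively(
    ["child", "book"],
    pick=lambda cands: cands[0] if cands else "<typed-by-human>",
)
print(result)  # ['bachcha', 'kitaab']
```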

    Sridhar Vedantham: Sorry to interrupt, but I was curious: how do you actually go about selecting or identifying partners and collaborators for these projects?

    Tanuja Ganu: When we started thinking about SCAI projects last year, we initiated a call for proposals in which we invited external partners and collaborators to submit their ideas and approaches for addressing societal impact problems. Interestingly, we received a huge pool of applications, more than 150, through that call. And through the careful evaluation process we discussed earlier, we finally selected a few projects to start under the SCAI umbrella.

    Sridhar Vedantham: OK, so I'm sorry I interrupted. You wanted to…you were speaking about another project.

    Tanuja Ganu: Yeah, so just to give a flavor of one more project we are currently doing, which addresses another important issue: air pollution. Air pollution is a major concern worldwide, with an estimated 7 million deaths every year, and in India it's an even more serious problem, since 13 of the 20 most polluted cities in the world are in India. To solve the air pollution problem, it is important to measure pollution levels correctly and capture their temporal and seasonal patterns in a granular manner, that is, from multiple locations within a city. Apart from the sophisticated and expensive air pollution monitoring stations already available, low-cost air pollution sensors are being deployed for this purpose. But low-cost sensors tend to drift or develop faults over time, and the entire monitoring pipeline and its analytical insights depend on the reliability and correctness of this IoT data. Taking this into account, we are now evaluating our research project called Dependable IoT on these low-cost air pollution sensors. Dependable IoT helps automatically identify and validate drift or malfunction in the sensors and notifies operators to recalibrate or replace them. We are currently working with a few startups in this space to evaluate the Dependable IoT technology, and as the project name suggests, it is not limited to air pollution sensing; the technology is applicable to many other IoT sensing use cases, in agriculture, food technology or healthcare. So I guess this gives you a view of some of the diverse projects we are working on at present in SCAI.
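
    As a rough illustration of the kind of check such a system can run, here is a minimal sketch that flags a low-cost sensor whose readings deviate persistently from the median of co-located sensors. This shows the idea only; the actual Dependable IoT techniques are more sophisticated, and all thresholds and data here are made up.

```python
# Flag sensors that deviate persistently from the cross-sensor median (toy sketch).
from statistics import median

def flag_drifting_sensors(readings, tolerance=0.25, min_violations=3):
    """readings: dict of sensor_id -> list of values at the same timestamps.
    Flags sensors whose reading deviates from the cross-sensor median by more
    than `tolerance` (relative) at `min_violations` or more timestamps."""
    n = len(next(iter(readings.values())))
    violations = {}
    for t in range(n):
        m = median(vals[t] for vals in readings.values())
        for sensor_id, vals in readings.items():
            if m > 0 and abs(vals[t] - m) / m > tolerance:
                violations[sensor_id] = violations.get(sensor_id, 0) + 1
    return [s for s, count in violations.items() if count >= min_violations]

# Usage: sensor "s3" reads systematically high and gets flagged for recalibration.
data = {
    "s1": [40, 42, 41, 43],
    "s2": [39, 41, 42, 44],
    "s3": [60, 63, 65, 66],
}
print(flag_drifting_sensors(data))  # ['s3']
```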

    Sridhar Vedantham: Yeah, this Dependable IoT thing sounds quite interesting. So correct me if I'm wrong, but essentially what you're saying is that before we extract information from the data and make decisions based on it, you're trying to make sure that the data itself is solid.

    Tanuja Ganu: Absolutely. That's correct, Sridhar, and it's like monitoring the monitor, right? So while we're doing the IoT monitoring and sensing, we need to make sure that the technology that we're putting in place is being monitored and it's giving us reliable and correct data.

    Sridhar Vedantham: Great. What's also coming across to me throughout this conversation is the variety of projects and collaborators in SCAI. Would I be right in saying that the people who are part of SCAI, in addition to the researchers who are your internal collaborators, are also a very diverse and varied set?

    Tanuja Ganu: Yes, absolutely true, Sridhar. As we discussed earlier, SCAI is an interdisciplinary team of social scientists, CS researchers, solid software engineers and designers. We also have a program called SCAI Fellows, where fresh undergraduates or candidates already working in industry can join a specific SCAI project for a fixed period and contribute towards its development. And in SCAI, in addition to all the technical and academic skills, we look for people who have a passion for societal impact and a willingness to do the field work and deployment needed to scale a research idea.

    Sridhar Vedantham: OK, and at any point of time you might be working on, say, four, five or six projects. What happens to these projects once they are completed?

    Tanuja Ganu: Yeah, each project has a different graduation plan. Whenever a project is complete from the SCAI perspective, we call it graduation, and the graduation plan defines how the project will sustainably grow further, internally or externally; it differs depending on the nature of the project. For some projects, the graduation plan could be an independent entity that is spun off to take the journey forward, scaling the initial idea to more people, more geographies or more use cases. A very good example of this is an MSR project called 99DOTS, which researchers like Bill Thies and others at Microsoft Research started to address medication adherence for tuberculosis. Over the years, this work has grown significantly, and an independent entity called Everwell was spun off to take the 99DOTS journey forward. Another type of graduation plan is putting the work and technology into open source, so that external social enterprises, NGOs or our collaborators can build on top of it and take the solution forward at a larger scale. An example of this is our work on interactive machine translation, where we have open-sourced our initial work and various collaborators are now using, validating and building on top of it.

    Sridhar Vedantham: OK, and does the work you do in SCAI, or the validation of research projects that you get through SCAI, feed back into the research itself, or does it just stay at SCAI?

    Tanuja Ganu: It has multiple pathways. Primarily, the work we're doing validates certain research hypotheses that we have. So the outcomes of SCAI projects feed back into the research areas, validating or invalidating hypotheses about whether the technology helps solve a particular research problem or not. But also, if an intervention is successful, collaborators, internal and external, can take the idea forward and use the technology we have built at SCAI to take it to a larger scale.

    Sridhar Vedantham: OK, so once again, given that your projects are so different in nature, how do you even define success metrics for SCAI projects?

    Tanuja Ganu: Yeah, this is a very interesting question, Sridhar. The whole purpose of SCAI, as the name suggests, is to bring about societal impact through technology innovation, so there is no one fixed set of metrics applicable to every project. But our success metrics are geared towards validating whether a technological intervention can support the people and the ecosystem and actually help address a specific problem or not. And if it does help, then how can we amplify the positive effect using technology? Those are the metrics we define for each project, depending on its nature.

    Sridhar Vedantham: So Tanuja, thank you so much for your time. This has been a great conversation and all the best for going forward in SCAI.

    Tanuja Ganu: Thank you, Sridhar, for having me here and I really enjoyed discussing these projects and ideas with you. Thank you.

    [Music Ends]

  • Making cryptography accessible, efficient and scalable. With Dr. Divya Gupta and Dr. Rahul Sharma · 7 Sep 2020 · Microsoft Research India Podcast

    Episode 005 | September 08, 2020

    Podcast: Making cryptography accessible, efficient and scalable. With Dr. Divya Gupta and Dr. Rahul Sharma

    Ensuring security and privacy of data, both personal and institutional, is of paramount importance in today’s world where data itself is a highly precious commodity. Cryptography is a complex and specialized subject that not many people are familiar with, and developing and implementing cryptographic and security protocols such as Secure Multi-party Computation can be difficult and also add a lot of overhead to computational processes. But researchers at Microsoft Research have now been able to develop cryptographic protocols that are developer-friendly, efficient and that work at scale with acceptable impact on performance. Join us as we talk to Dr. Divya Gupta and Dr. Rahul Sharma about their work in making cryptography easy to use and deploy.

    Dr. Divya Gupta is a senior researcher at Microsoft Research India. Her primary research interests are cryptography and security. Currently, she is working on secure machine learning using secure multi-party computation (MPC), and on lightweight blockchains. Earlier, she received her B.Tech and M.Tech in Computer Science from IIT Delhi and her PhD in Computer Science from the University of California, Los Angeles, where she worked on secure computation, coding theory and program obfuscation.

    Dr. Rahul Sharma has been a senior researcher at Microsoft Research India since 2016. His research lies at the intersection of Machine Learning (ML) and Programming Languages (PL), which can be classified into the two broad themes of “ML for PL” and “PL for ML”. In the former, he has used ML to improve the reliability and efficiency of software; in the latter, he has built compilers to run ML on exotic hardware like tiny IoT devices and on cryptographic protocols. Rahul holds a B.Tech in Computer Science from IIT Delhi and a PhD in Computer Science from Stanford University.

    Click here for more information on Microsoft Research’s work in Secure Multi-party Computation, and here to go to the GitHub page for the project.

    Transcript

    Divya Gupta: We not only make existing crypto out there more programmable and developer friendly, but we have developed super-duper efficient cryptographic protocols which are tailored to ML tasks, like secure machine learning inference, and which work for large machine learning benchmarks. Before our work, the prior work had three shortcomings, I would say: it was slow, it only handled small machine learning benchmarks, and the accuracy of the secure implementations was lower than that of the original models. We solved all three challenges. Our new protocols are at least 10 times faster than what existed out there.

    [Music]

    Sridhar: Welcome to the Microsoft Research India podcast, where we explore cutting-edge research that’s impacting technology and society. I’m your host, Sridhar Vedantham.

    [Music]

    Ensuring security and privacy of data, both personal and institutional, is of paramount importance in today’s world where data itself is a highly precious commodity. Cryptography is a complex and specialized subject that not many people are familiar with, and developing and implementing cryptographic and security protocols such as Secure Multi-party Computation can be difficult and also add a lot of overhead to computational processes. But researchers at Microsoft Research have now been able to develop cryptographic protocols that are developer-friendly, efficient and that work at scale with acceptable impact on performance. Join us as we talk to Dr. Divya Gupta and Dr. Rahul Sharma about their work in making cryptography easy to use and deploy.

    Sridhar Vedantham: Alright, so Divya and Rahul, welcome to the podcast. It's great to have you guys on the show and thank you so much. I know this is really late in the night so thank you so much for taking the time to do this.

    Divya Gupta: Thanks Sridhar for having us. Late is what works for everyone right now. So yeah, that's what it is.

    Rahul Sharma: Thanks Sridhar.

    Sridhar Vedantham: Alright, so this podcast, I think, is going to be interesting for a couple of reasons. One is that the topic is something I know next to nothing about, but it seems to me from everything I've heard that it's quite critical to computing today, and the second reason is that the two of you come from very different backgrounds in terms of your academics, in terms of your research interests and specialities, but you're working together on this particular project or on this particular field of research. So let me jump into this. We're going to be talking today about something called Secure Multi-party Computation or MPC. What exactly is that and why is it important?

    Divya Gupta: Right, so Secure Multi-party Computation, popularly known as MPC as you said, is a cryptographic primitive which at first seems completely magical. So let me just explain with an example. Let's say you, Sridhar, and Rahul are two millionaires and you want to know who has more money, who's richer. And you want to do this without revealing your net worth to each other, because this is private information. At first this seems almost impossible: how can you compute a function without revealing the inputs of the function? But MPC makes this possible. What MPC gives you is an interactive protocol in which you and Rahul will talk to each other back and forth, exchanging some random-looking messages. And at the end of this interaction you will learn the output, which is who is richer, and you will learn the output alone. So this object, MPC, comes with strong mathematical guarantees which say that at the end of this interaction only the output, and anything which can be deduced from the output, is revealed- nothing else about the inputs. So in this example, Sridhar, you and Rahul both will learn who is richer. And let's say you turn out to be richer. Then of course, from this output, you would know that your net worth is more than Rahul’s, and that's it- nothing else will you learn about Rahul’s net worth. So this is what MPC is. This example is called the Millionaires’ Problem, where the function is very simple: you're just trying to compare two values, which are the net worths. But MPC is much more general. Just going into a bit of history, MPC can compute any function of your choice on secret inputs. This result in fact was shown as early as the 1980s, and this power of MPC, of being able to compute any function securely, got many people interested in this problem. So a lot of work happened and people kept coming up with better and better protocols which were more efficient. When I say efficient, some of the parameters of interest are the amount of data being sent in the messages back and forth, the number of messages you need to exchange, and also the end-to-end latency of the protocol- how much time it takes to compute the function itself. Finally, the first implementations came out in 2008, and since then people have evaluated a few real-world examples using MPC. One example which I found particularly interesting is the following: a social study which was done in a privacy-preserving manner using MPC in Estonia in 2015.

    So, the situation was as follows.

    Along with the boom in information and communication technology, it was observed that more and more students were dropping out of college without finishing their degree. And the hypothesis going around was that students, while studying at university, get employed in IT jobs, start to value their salaries more than their university degree, and hence drop out. But a counter-hypothesis was that because IT courses are gaining popularity, more and more students are enrolling in them, finding them hard, and dropping out. So the question was: is working in IT jobs during studies correlated with a high dropout rate?

    To answer this question, a study was proposed to understand the correlation between students' early employment in IT jobs while enrolled in university and a high dropout rate. Now, this study could be done by taking the employment records from the tax department and the enrollment records from the education department and simply cross-referencing this data. But even though all of this data was there with the government, it could not be shared in the clear between the two departments because of legal regulations, and the way they solved this problem was by running a Secure Multi-party Computation between the Ministry of Education and the tax board.

    So this, I feel, is an excellent example which shows that MPC can help solve real problems where data sharing is important but cannot be done in the clear.
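
    To make the “random-looking messages” idea concrete, here is a minimal Python sketch of additive secret sharing, a building block underneath many MPC protocols. This is an illustrative toy, not the protocol used in the work discussed here: each value is split into shares that individually look uniformly random, so no single share reveals anything about the secret.

        import secrets

        MOD = 2**64  # work in the ring of integers modulo 2^64

        def share(x, n_parties=2):
            # Split x into n additive shares that sum to x modulo MOD.
            # Any strict subset of the shares is uniformly random.
            shares = [secrets.randbelow(MOD) for _ in range(n_parties - 1)]
            shares.append((x - sum(shares)) % MOD)
            return shares

        def reconstruct(shares):
            return sum(shares) % MOD

        # Each millionaire secret-shares their net worth.
        alice_shares = share(5_000_000)
        bob_shares = share(7_000_000)

        # Linear operations are 'free': each party adds the shares it holds,
        # yielding shares of the sum with no extra communication.
        sum_shares = [(a + b) % MOD for a, b in zip(alice_shares, bob_shares)]
        assert reconstruct(sum_shares) == 12_000_000

        # Comparison (the Millionaires' Problem) is not linear; real protocols
        # need extra interaction (e.g., garbled circuits or oblivious transfer).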

    Sridhar Vedantham: OK. Rahul was there something you wanted to add to that?

    Rahul Sharma: Yes, Sridhar. If you look at what is happening today, data is being digitized- financial documents, medical records, everything- so we have a flood of data becoming available. And the other thing which has happened in computer science is that we now have very, very powerful machine learning algorithms and very powerful hardware which can crunch this huge amount of data with those algorithms. Machine learning people have created, for example, models which can beat human accuracy on tasks in computer vision. Computer vision is basically: you have an image and you want to find some pattern in that image- for example, does the image belong to a cat or a dog? And now we have classifiers which will beat humans on such tasks. The way these machine learning classifiers work is that they use something called supervised machine learning, which has two phases: a training phase and an inference phase. In the training phase, machine learning researchers collect and curate the data and throw a lot of hardware at it to generate a powerful machine learning model. Then there is an inference phase, in which new data points come in and the model labels or makes predictions on these new inputs. Now, after you have gone through this expensive training phase, the companies or organizations who do this want to monetize the model they have obtained. If you have to monetize this model, then you have two options. One is that you can just release the model to the clients, who can download the model and run it on their private data.

    Now, if they do this, then first of all, the company is not able to monetize the model, because the model has just been given away; and second, all the privacy of the training data which was used to generate this model is lost, because now someone can look at the model and try to reverse engineer what the training data was. So this is not a good option. Another option is that the organization can host the model as a web service and then the clients can send their data to the company for predictions. Now, this is also not a good option because, first of all, clients will have to reveal their sensitive data to the organization holding the model, and moreover, the organization itself would not like to have this client data because it is just red hot, right? If they hold client data and there is a data breach, then there are legal liabilities. So here we have a situation where there is an organization with its own proprietary model, and we have clients with their sensitive data, and these two parties don't want to reveal their inputs to each other. But still, the organization wants to provide a service in which the client can give the data, receive predictions, and in exchange for the predictions, give some money to the organization. And MPC will help achieve this task.

    So what I think is that MPC will enable machine learning to reach its full potential, because machine learning has always been hampered by issues of data privacy, and with MPC combined with machine learning, these data privacy issues can be mitigated.

    Sridhar Vedantham: Interesting, that's really interesting. Now obviously this sounds like a great thing to be able to do in this day and age of the Internet and machine learning. But it sounds to me that, given that you have so many people from the research community working on it, there have got to be certain challenges that you need to first overcome to make this practical and usable, right? Why don't you walk me through the issues that currently exist with implementing MPC at scale?

    Rahul Sharma: What you said, Sridhar, is exactly correct. There are three issues which come up, summarized as efficiency, scalability, and programmability. So what is efficiency? The thing is that a secure solution is going to be slower than an insecure solution, because the insecure solution is not doing anything about security. When implementing a secure solution, you are doing something more to ensure the privacy of data, so there is going to be a performance overhead. That's the first issue: we want MPC protocols to have a bearable overhead, which, as Divya said, is what people have been working on for decades- bringing that overhead down.

    The second is that machine learning models are becoming bigger and bigger and more and more complicated. So we want to take these MPC protocols and scale them to the level of machine learning which exists today. And the third challenge, which I believe is the most pressing, is programmability. When we think of these MPC protocols, who is going to implement them at the end of the day? If it is a normal developer, then we have a problem, because normal developers don't understand security that much. There was a web forum post in which a person said, “Oh, I need to ship a product. I'm going to miss the deadline. I'm getting all these security warnings. What should I do?”. And a Good Samaritan came in and said, “Oh, you are calling this function with the value one. Just call it with the value 0 and the error should go away.” And then the developer replied, “Great. Now I'm able to ship my product. All the warnings have gone away. You saved my life,” and so on. Now, in switching that one to zero, the developer switched off all security checks- all certificate checking, all encryption, everything got switched off. So MPC protocols can be good in math, but when they are given to normal developers, it's not clear whether those developers will be able to implement them correctly.

    Divya Gupta: Actually, I would like to chime in here. What Rahul said is a great story, and rather an extreme one. But I, as a cryptographer, can vouch for the fact that cryptography as a whole field is mathematically challenging and quite subtle. Many a time, even we experts come up with crypto protocols which on the face of it look secure, with no issues at all. But as soon as we start to dive deeper and try to prove the security of the protocol, we see that there are big security vulnerabilities which cannot be fixed. So I cannot stress enough that when it comes to crypto, it is very, very important to have rigorous proofs of correctness and security. Even small, tiny tweaks here and there, which look completely harmless, can completely break the whole system. So it is completely unreasonable to expect people or developers who have had no formal training in crypto or security to be able to implement these crypto protocols correctly and securely. And this, in fact, we feel is one of the biggest technical challenges in deploying MPC for real-world applications.

    Sridhar Vedantham: Interesting, so I've got a follow-up question to something that you both just spoke about. Obviously, the cryptographer brings in the whole thing about the security and how to make secure protocols, and so on and so forth. What does the programming languages guy or the ML person bring to the table in this scenario?

    Rahul Sharma: Yeah, so I think that's a question for me, since I work at the intersection of compilers and machine learning. If I put my developer hat on and someone tells me to implement the MPC protocols written in these papers, I will be scared to death- I'm pretty sure I will break some security property here or there. So I think the only way to get secure systems is to not let programmers implement those secure systems. What we want to do is build compilers, which are automatic tools that translate programs from one language to another, so that programmers write their normal code without any security, like they are used to writing, and then the compiler does all the cryptography and generates MPC protocols. This is where a compiler person comes in: to make the system programmable by normal programmers.
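
    As a rough illustration of the first step such a compiler might take- a toy sketch, not how the actual system discussed here is implemented- ordinary arithmetic code can be lowered into a circuit of ADD and MUL gates, which an MPC backend could then evaluate gate by gate on secret shares.

        import ast

        def to_circuit(expr_src):
            # Lower a Python arithmetic expression (+ and * over named inputs)
            # into a list of gates (output_wire, op, lhs, rhs). A toy stand-in
            # for the front end of an MPC compiler; real compilers do far more.
            tree = ast.parse(expr_src, mode="eval").body
            gates, counter = [], [0]

            def visit(node):
                if isinstance(node, ast.Name):      # a secret input wire
                    return node.id
                if isinstance(node, ast.Constant):  # a public constant
                    return node.value
                if isinstance(node, ast.BinOp):
                    lhs, rhs = visit(node.left), visit(node.right)
                    op = {ast.Add: "ADD", ast.Mult: "MUL"}[type(node.op)]
                    counter[0] += 1
                    wire = f"t{counter[0]}"
                    gates.append((wire, op, lhs, rhs))
                    return wire
                raise ValueError("unsupported construct")

            out = visit(tree)
            return gates, out

        # The developer writes plain arithmetic; the 'compiler' emits gates.
        # With additive shares, ADD gates are local; MUL gates need interaction.
        gates, out = to_circuit("x * w1 + b1")
        print(gates)  # [('t1', 'MUL', 'x', 'w1'), ('t2', 'ADD', 't1', 'b1')]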

    [Music]

    Sridhar Vedantham: OK, so let's be a little more specific about the work that both of you have actually been doing over the past few years, I guess. Could you talk a bit about that?

    Rahul Sharma: So, continuing with the compiler part of the story. First we built compilers in which developers could write C-like code, and we could automatically generate secure protocols from it. This gives a lot of flexibility, because C is a very expressive language and you can do all sorts of different computations. But then we realized that machine learning people don't want to write in C. They want to write in their favorite machine learning frameworks like TensorFlow, PyTorch or ONNX. So what we did was build compilers which take machine learning models written in TensorFlow, PyTorch or ONNX and compile them directly to MPC protocols. And the compilers which we built have some good properties. First of all, they're accuracy-preserving, which means that if you run the insecure computation and get some accuracy, then when you run the secure computation you get the same accuracy. Now, this was extremely important because these machine learning people care for every ounce of accuracy. They can live with some computational overhead because of security, but if they lose accuracy, that means the user experience gets degraded and they lose revenue- that is just a no-go. So our compiler ensures that no accuracy is lost in the secure execution. Moreover, the compiler also has some formal guarantees, which means that even if the developer unintentionally or inadvertently does something wrong which could create a security leak, the compiler will just reject the program. Developers can thus be confident that when they use our framework, if they have written something and it compiles, then it is secure.

    Divya Gupta: As I think Sridhar already pointed out, this is a project which is a great collaboration between cryptographers and programming languages folks. So we not only make advances on the programming languages front, but also on the cryptography side. We make progress on all three challenges which Rahul mentioned before: efficiency, scalability and programmability. We not only make existing crypto out there more programmable and developer friendly, but we have developed super-duper efficient cryptographic protocols which are tailored to ML tasks, like secure machine learning inference, and which work for large machine learning benchmarks. Before our work, the prior work had three shortcomings, I would say: it was slow, it only handled small machine learning benchmarks, and the accuracy of the secure implementations was lower than that of the original models. We solved all three challenges. Our new protocols are at least 10 times faster than what existed out there. We run large ImageNet-scale benchmarks using our protocols- the ImageNet data set is a standard machine learning classification task where an image needs to be classified into one of a thousand classes, which is hard even for a human to do. For this task we take state-of-the-art machine learning models and run them securely, and these models are again at least 10 times larger than what prior works handled securely. And finally, all our secure implementations in fact match the accuracy of the original models, which is very important to ML folks. All of this could not have been possible without our framework, which is called CrypTFlow, which again would not have been possible without a deep collaboration between cryptographers and programming languages folks.

    So this, I think, summarizes well what we have achieved in the last few years with this collaboration.

    Sridhar Vedantham: That's fantastic. Rahul, you wanted to add to that?

    Rahul Sharma: I want to add a little bit about the collaboration aspect which Divya mentioned. This project was started by Nishanth Chandran, Divya, Aseem Rastogi and me at MSR India, and all of us come from very different backgrounds: Divya and Nishanth are cryptographers, I work at the intersection of machine learning and programming languages, and Aseem works at the intersection of programming languages and security. Since all of us came together, we could address applications or scenarios with MPC much better, because given a scenario, we could figure out whether we should fix the compiler or fix the cryptography. Our meetings are generally sword fights- we would fight for hours over very, very simple design decisions- but the final design we came up with is something all of us are very happy with. And this wouldn't have been possible without our hard-working Research Fellows and the fantastic set of interns who worked on this project.

    Sridhar Vedantham: Fantastic, and I think that's a great testament to the power of interdisciplinary work. And I can totally buy what you said about sword fights during research meetings, because, while I've not sat through research meetings myself, I have certainly attended research reviews, so I can completely identify with what you're saying from what I've seen. Alright, there's one thing that I wanted to clarify for myself and, I think, for the benefit of a lot of people who will be listening. When you say things like the complexity decreases, we can run things faster, and the overheads are less, these concepts sound fairly abstract to people who are not familiar with this area of research. Could you put a more tangible face on it- when you say that you reduce overheads, is there a certain percentage, or can you give it in terms of time and so on?

    Divya Gupta: Right. So when we talk about the efficiency of our protocols, we measure things like their end-to-end runtimes- how much time it takes for the whole function to run securely- and this depends on things like the amount of data being transferred in the messages exchanged between the parties. Just to take an example from our latest paper, to appear at CCS this year: we built new protocols for the simple Millionaires’ Problem, which I described at the very beginning, and there we have almost a 5X- five times- improvement in just the communication numbers. And this translates to runtimes as well. Now, this Millionaires’ protocol is a building block for our other protocols. In a machine learning task, let's say there is a neural network. Neural networks consist of linear layers, which look like matrix multiplications or convolutions, and also some nonlinear operators, such as rectified linear units (ReLU) or MaxPool. In all of these nonlinear layers you have to do some kind of comparison on secret values, which essentially boils down to solving some kind of Millionaires’ Problem. So whatever improvements we got in the simplest setting of Millionaires’ translate to these more complicated functions as well. In fact, our improvements for more complicated functions are much better than for Millionaires’ alone- there we have almost 10 times improvement in the communication numbers. And when you're actually running these protocols over a network, communication is what matters the most, because compute is local: you can parallelize it, you can run it on heavy machines and so on, but communication is something which you cannot essentially make go faster. So all our protocols have been handmade and tailored to the exact functions which occur in neural networks, and we improve the communication numbers and hence the runtimes as well.
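
    To see why comparison is the workhorse here: ReLU(x) equals x if x > 0 and 0 otherwise, so evaluating ReLU on secret-shared values reduces to a secure comparison (a Millionaires’-style subprotocol, often called DReLU) followed by a multiplication by the resulting bit. Below is a rough Python sketch of that reduction; the comparison and the multiplication are done by insecure stand-ins purely for illustration, whereas the real protocols replace them with interactive cryptography.

        import secrets

        MOD = 2**64
        HALF = MOD // 2  # values >= HALF encode negatives (two's complement view)

        def share(x):
            r = secrets.randbelow(MOD)
            return r, (x - r) % MOD

        def reconstruct(a, b):
            return (a + b) % MOD

        def insecure_drelu(x0, x1):
            # Stand-in for the secure comparison subprotocol: returns shares
            # of the bit [x > 0]. A real protocol computes this interactively
            # without any party ever seeing x.
            x = reconstruct(x0, x1)
            return share(1 if 0 < x < HALF else 0)

        def insecure_mul(a0, a1, b0, b1):
            # Stand-in for secure multiplication of two shared values
            # (done with Beaver triples or oblivious transfer in practice).
            return share((reconstruct(a0, a1) * reconstruct(b0, b1)) % MOD)

        def relu_on_shares(x0, x1):
            b0, b1 = insecure_drelu(x0, x1)      # shares of [x > 0]
            return insecure_mul(b0, b1, x0, x1)  # shares of [x > 0] * x

        for x in (42, 0, -7):
            x0, x1 = share(x % MOD)
            y = reconstruct(*relu_on_shares(x0, x1))
            print(x, "->", y if y < HALF else y - MOD)  # 42 -> 42, 0 -> 0, -7 -> 0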

    Sridhar Vedantham: OK, thanks for that. It certainly makes things a little clearer to me. Because to me a lot of this stuff just sounds very abstract unless I hear some actual numbers or some real instances where these things impact computation and actual time taken to conduct certain computations.

    Divya Gupta: Right, so just to give you another example: the task of ImageNet classification which I talked about? We took state-of-the-art models there, and our end-to-end inference runtime was under a minute. So it doesn't run in seconds, but it definitely runs in under a minute, so it is still real, I would say.

    Sridhar Vedantham: Right, so Divya, thanks. I mean, that certainly puts a much more tangible spin on it, which I can identify with. Are there any real-life scenarios in which you see MPC bringing benefits to people or to industry, right now, in the near term?

    Rahul Sharma: So Sridhar, I believe that MPC has the potential to change the way we think about healthcare. Think of, for example, a hospital which has trained a model that, given a patient image, can tell whether the patient has COVID or pneumonia, or whether the patient is alright. Now, the hospital can post this model as a web service, and what I can do is go to my favorite pathological lab, get a chest X-ray done, and then do a multi-party computation with the hospital. My sensitive data- my chest X-ray images- will not be revealed at all to the hospital, and I will get a prediction which can tell me how to go about the next steps. Now, we have actually run this task with MPC protocols, and it runs in a matter of a minute or two- a latency which is quite acceptable in real life. Other applications which we have looked at include detecting diabetic retinopathy from retina scans. We have also run machine learning algorithms which give state-of-the-art accuracies in detecting about 14 chest diseases from X-ray images. The most recent work we have done is in tumor segmentation. There, the doctor is given a 3D image and has to mark the boundary of the tumor in this 3D image- it is like a volume which the doctor is marking. This is a very intensive process and takes a lot of time, and one can think of training a machine learning model which can help the doctor with this task: the model marks some boundary, and then the doctor can fine-tune it, make minor modifications, and approve the boundary. Now, we already have machine learning algorithms which can do this, but then again, patients will be wary of giving their 3D scans to the model owners. So what MPC can do is let this task be done securely, without revealing the 3D scan to the organization which owns the machine learning model, and this we can do in a couple of hours. To put things in perspective, doctors usually get to a scan in a matter of a couple of days, so again, this latency is acceptable.

    Divya Gupta: Another domain of interest for MPC is potentially finance. We all know that banks are highly secretive entities, for the right reasons, and they cannot and do not share their data even with other banks. This makes many tasks quite challenging, such as detecting fraudulent transactions and detecting money laundering, as the only data available is the bank’s own data and nothing else. What MPC can enable is for banks to pool their data and do fraud detection and money laundering detection together on all the banks’ data, while at the same time no bank’s data is revealed in the clear to any other bank. So all this can happen securely, and you can still reap the benefits of pooling the data of all the banks. In fact, many of these schemes, like money laundering, actually work by siphoning money through multiple banks, so you indeed need the data of all the banks. What I'm trying to get at is that the power of MPC is very general: as long as you and I have some secret data which we do not want to reveal to each other, but at the same time we want to pool this data together and jointly compute some function that benefits us both, MPC can be used.

    Sridhar Vedantham: So this sounds fantastic, and it also sounds like there's a huge number of areas in which you can actually deploy and implement MPC, and I guess it's being made much easier now that you guys have come up with something that makes it usable, which it wasn't really earlier. So, the research findings and the research work that you guys have done- is it available to people outside of Microsoft? Can the tech community at large leverage and use this work?

    Divya Gupta: Yes. Fortunately, all of our protocols and work have been published at top security conferences and are available online, and all the code is also available on GitHub. So if you have a secure inference scenario, you can actually go out there, try this code and code up your application.

    Sridhar Vedantham: Excellent, so I think what we'll also do is provide the links to resources that folks can access in the transcript of this podcast itself. Now, where do you guys plan to go with this in the future and what are your future research directions, future plans for this particular area?

    Rahul Sharma: So, going back to machine learning: as I said, there are two phases- a training phase and an inference phase- and we have been talking mainly about the inference phase till now, because that is what we have focused on in our work. But the training phase is also very important. Suppose there are multiple data holders- take, for example, multiple hospitals- and they want to pool their data together to train a joint model. There can be legal regulations which prohibit them from sharing data indiscriminately with each other. They can then use MPC to train a model together. I've heard bizarre stories of nurses sitting down with permanent markers to redact documents, and of legal agreements which take years to get through- MPC just provides a technological solution to do this multi-party training.

    Divya Gupta: So, we live in a world where security is a term which gets thrown around a lot without any solid backing. To make MPC real, we feel that we have to educate people and businesses about the power of MPC and the security guarantees it can provide. As an example, take encryption. I think most people, businesses and even the law understand what encryption is and what guarantees it provides, and as a result, most real-world applications use end-to-end encryption. But if I ask a person the following- there are two parties who have secret inputs and they want to compute some function by pooling their inputs; how do I do this?- the most likely answer I would get is that the only possible solution is to share the data under some legal NDAs. Most people simply don't know that something like MPC exists. I'm not saying that MPC will be as omnipresent as encryption, but with this education we can put MPC on the table, and people and businesses can think of MPC as a potential solution to security problems. In fact, as we talk to more and more people and educate them about MPC, new scenarios are discovered which MPC can enable. Moreover, with regulations like GDPR which are aimed at preserving privacy, and with bigger and bigger ML models which need more and more data for better accuracy, we feel that MPC is a technology which can resolve this tension.

    Sridhar Vedantham: Excellent. This has been a really eye-opening conversation for me, and I hope the people who listen to this podcast will learn as much as I have. Thank you so much, Divya and Rahul- once again, I know it's really late, and I totally appreciate your time.

    Divya Gupta: Thanks Sridhar, thanks a lot for having us here.

    Rahul Sharma: Thanks Sridhar, this was fun.

    [Music Ends]

  • Can we make better software by using ML and AI techniques? With Chandra Maddila and Chetan Bansal · 3 Aug 2020 · Microsoft Research India Podcast

    Episode 004 | August 04, 2020

    Podcast: Can we make better software by using ML and AI techniques? With Chandra Maddila and Chetan Bansal

    The process of software development is dramatically different today compared to even a few years ago. The shift to cloud computing has meant that companies need to develop and deploy software in ever shrinking timeframes while maintaining high quality of code. At the same time, developers can now get access to large amounts of data and telemetry from users. Is it possible for companies to use Machine Learning and Artificial Intelligence techniques to shorten the Software Development Life Cycle while ensuring production of robust, cloud-scale software? We talk about this and more with Chandra Maddila and Chetan Bansal, who are Research Software Development Engineers at Microsoft Research India.

    Click here for more information on Project Sankie.

    Transcript

    Chandra Maddila: One of the biggest disconnects we used to have in the boxed product world, where we used to ship software as a standalone product and give it to customers, is that once the customer takes the product, it is in their environment- we don’t have any idea about how it is being used and what kind of issues people are facing, unless they come back to Microsoft support and say, “Hey, we are using this product, we are running into these issues, can you please help us?”. But with the advent of services, one of the beautiful things that happened is that we now have the ability to collect telemetry about various issues that are happening in the service. This helps us proactively fix issues, help customers mitigate outages, and also join the telemetry data from the deployment side of the world all the way back to the coding phase, which is the first phase of the software development life cycle.

    [Music]

    Sridhar: Welcome to the Microsoft Research India podcast, where we explore cutting-edge research that’s impacting technology and society. I’m your host, Sridhar Vedantham.

    [Music]

    Sridhar: Chandra and Chetan, welcome to the podcast. And thank you for making the time for this.

    Chetan: Thanks, Sridhar, for hosting this.

    Chandra: Thanks Sridhar, thanks for having us.

    Sridhar: Great! Now, there is something that interested me when I decided to host this podcast with you guys. You are both research software development engineers, and Microsoft Research is known for being this hardcore computer science research lab. So, what does it mean to be a software developer in a research org like MSR? And how is it different from being a software developer in, say, a product organization, if there is a difference?

    Chetan: Yeah, that’s a great question, Sridhar, about the difference between the RSDE role- research software development engineer- at MSR vs. the product groups at Microsoft. In my experience, the RSDE role is sort of open ended, because oftentimes research teams work on open-ended research problems. So RSDEs often work on things like prototypes and building products from the ground up, which are deployed internally and which are the precursors of products shipped to our customers. So there’s a lot of flexibility and openness in terms of what RSDEs work on, and it can range from open-ended research to actually building products which are shipped to our customers. There’s a wide spectrum of things and roles which an RSDE plays.

    Sridhar: Chandra, what’s your take on that?

    Chandra: I think Chetan summarized it pretty well. The RSDE role in general is much more flexible compared to a typical software engineer role in product groups. You can switch from area to area and product to product. I, for example, was working on NLP for some time, then web applications and learning platforms for some time. Then I switched to software engineering. So we have this flexibility to move across different areas. One thing we do as RSDEs is work on long-term problems, problems built from the ground up, which take some time to incubate and productize, whereas software engineers in product groups have well-defined scope and well-defined problems which are aligned to their product’s vision. In that way, they are slightly more constrained in the kind of problems they work on. But at the same time, one of the greatest advantages people in product groups have is access to customers. They are very close to customers, really work on customer problems, and ship things much faster, whereas RSDEs in MSR don’t have direct access to customers.

    Sridhar: Interesting, so it sounds like it’s kind of a play between customer access and freedom as far as RSDEs are concerned.

    Chandra: Yeah, as RSDEs in Microsoft Research, we have a lot more flexibility and room to explore more interesting areas in research- new and upcoming areas like quantum computing, blockchain, or advances in AI/ML- and do more exploratory things.

    Chetan: Just wanted to add another thing here. A lot of times, people have the misconception that in Microsoft Research or in other research organizations, a doctorate or PhD is required to get a job. But there are roles such as RSDE, product manager, program manager or even designer which people can take on without a PhD, and they can still contribute to the research happening in companies like Microsoft.

    Sridhar: Great. Now, we keep hearing now-a-days that the process of software development has changed tremendously over the last few years. So, what’s actually caused these changes?

    Chetan: I think, to start with, there are two things which in my opinion have caused this sort of revolution in the software development industry. One of them is the move to the services-oriented world: we are no longer shipping boxed products on a CD or a DVD, but are actually shipping and selling services which are used by our customers, unlike before, when you shipped software that was used by customers for a couple of years before they updated it. So I think that’s one key change which has happened in the last decade. The other major paradigm shift is the move to the cloud. Even in terms of deployment, software today is deployed on the cloud instead of on-prem, that is, within the premises of a customer or a company. That has brought a whole range of changes in how software is developed, deployed, and maintained within small and big companies, even Microsoft. Today, startups and new companies don’t have to spend a lot of money in capex- capital expenditure on buying servers or hiring people to maintain the servers- but can basically ship and operate out of the cloud, which saves a lot of money and time. So, in my opinion, these are the two major paradigm shifts which have happened and which have positively impacted the software industry.

    Chandra: Compared to the 90’s, when we used to, for instance, ship boxed products, now everything is becoming a service. That is primarily driven by customer expectations: these days, customers expect companies to ship services faster and make new features available at a much faster pace. This is also accelerated by the development and growth of cloud computing technologies, which let software companies and developers scale their services really fast, serve more people, and ship things much faster.

    Sridhar: So, I know for a fact that earlier there used to be these long ship cycles where somebody would develop some software, and there would be a bunch of people testing it and after which it would reach the customer, whether it would be the retail customer or the enterprise customer, right. I think, a lot of these processes have either disappeared or been extremely compressed. So, what kind of challenges and opportunities do these changes provide you guys as software developers?

    Chandra: So, these rapid development models, where people are expected to ship really fast, have brought the duration of ship cycles down to even days- in a single day, you experience the entire software development life cycle, all the steps starting from coding, to testing, to deployment. This definitely poses a lot of challenges, because you have to make sure you are shipping fast, but at the same time ensuring your service is stable and customers are not experiencing any interruptions. So you need to build tools and services that aid developers in achieving this. These tools and services have to be pretty robust, making sure they catch all the catastrophic bugs early on and letting developers achieve this feat of shipping their services much faster. So the duration between someone writing the code and the code hitting the customer has come down significantly, which is what we all need to make sure we support.

    Chetan: I just want to add two more changes which have helped evolve the software development life cycle and processes. First is the possibility of collecting telemetry and data from our users. We are able to observe how our features or our code have been behaving or being used in near real time, which allows us to see if there is any regression, any changes, or any bugs which need to be fixed. This wasn’t possible in the past in the boxed software world, because we didn’t have access to the telemetry. The second aspect is having a set of users who are helping you test your features and services at the same time. So now, we can do software development in parallel as we roll out our current set of features.

    Sridhar: Cool. So, it sounds like you guys are now able to get a large amount of data as well as telemetry from the users, right. How does this actually help in making the software development life cycle more efficient or faster?

    Chetan: So, I think there are two aspects. One of them, which I just highlighted, is that we now get real-time or near real-time telemetry on how different aspects of our software or services are being used. And the second is that if there are any regressions or anomalies happening, we are able to detect and resolve them very quickly, which wasn’t possible before.

    Chandra: One of the biggest disconnects we used to have in the boxed product world, where we used to ship software as a standalone product and give it to customers, is that once the customer takes the product, it is in their environment. We don’t have any idea about how it is being used and what kind of issues people are facing, unless they come back to Microsoft support and say, “Hey, we are using this product, we are running into these issues, can you please help us?”. But with the advent of services, one of the beautiful things that happened is that we now have the ability to collect telemetry about various issues that are happening in the service. This helps us proactively fix issues and help customers mitigate outages. It also lets us join the telemetry data from the deployment side of the world all the way back to the coding phase, which is the first phase of the software development life cycle, and give valuable insights to developers, so that in the code itself they have an understanding of how the code is going to behave out there in the wild, and can be more cautious and cause fewer bugs or issues.

    [Music]

    Sridhar: There have been a couple of terms which have become, I think, very prominent over the last few years. Two terms come to mind immediately: one is DevOps and the other is AIOps. What exactly are these?

    Chetan: So, DevOps is a commonly used term across the software development industry which refers to the set of practices and tools for developing, deploying and shipping software- how different companies actually build software, what their practices are (for example, how you do code reviews, how you check in code, how you deploy code), and also the tools and infrastructure involved. So, in my opinion, that’s the definition of DevOps: a fairly abstract term which refers to different sets of practices and tools for software development. AIOps is a term introduced more recently, probably in the last few years: because of access to telemetry and data from our software and users, we are able to leverage data science and machine learning to optimize a lot of key aspects of the DevOps life cycle. For instance, while doing code reviews, can we use machine learning and data science to catch bugs? That’s a very simple example that gives an idea of how artificial intelligence can help different aspects of DevOps- and that’s branded as AIOps.

    Chandra: So, DevOps is actually a combination of two words: Development plus Operations. In the boxed product world, companies shipped software through CDs or DVDs, as Chetan mentioned- we used to develop software and sell it to customers, and all the operational aspects of the software, that is, deploying it in their organizations, maintaining it, and making sure it is running properly, were in the hands of the customer who took the software from vendors like Microsoft. But with the advent of services, Microsoft has also become a services provider- like Satya famously says, Microsoft is now a services company and we provide solutions to customers. So we acquired this innate need to do operations inside Microsoft as well, which makes us do both development and operations together- DevOps- inside Microsoft itself. This basically combines the different phases of the software development life cycle, starting from coding and testing through deployment and customer support, feeding the feedback loop back into development and iterating over all these phases again and again. AIOps is a term that has been coined in the last couple of years. AIOps specifically means using technologies like artificial intelligence and machine learning and leveraging them to solve problems and operational challenges in software development. For instance, you take an AI algorithm and use it to solve the root-causing problem in software services- that is a classic example of using AI to solve a real problem in operations. We now have a variety of problems that occur on the operations side of software development, because of the scale at which software development is happening, and using and applying AI/ML techniques to solve those problems, put together, can be called AIOps.
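
    As a toy illustration of the kind of analysis AIOps enables- the data, window size and threshold below are invented for the example, not taken from any Microsoft system- here is a minimal Python sketch that flags anomalies in a service’s error-rate telemetry using a rolling mean and standard deviation.

        import statistics

        def detect_anomalies(series, window=10, n_sigma=3.0):
            # Flag points deviating more than n_sigma standard deviations
            # from the rolling mean of the preceding `window` points.
            # A deliberately simple stand-in for production detectors.
            anomalies = []
            for i in range(window, len(series)):
                history = series[i - window:i]
                mu = statistics.mean(history)
                sigma = statistics.stdev(history) or 1e-9  # guard zero variance
                if abs(series[i] - mu) > n_sigma * sigma:
                    anomalies.append(i)
            return anomalies

        # Simulated per-minute error counts for a service; a bad deployment
        # at minute 25 roughly quadruples the error rate.
        errors = [5, 6, 4, 5, 7, 5, 6, 5, 4, 6] * 2 + [5] * 5 + [18, 20, 19, 21, 22]
        print(detect_anomalies(errors))  # flags the onset of the spike at index 25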

    Sridhar: Ok. Now, I know you guys have been working for a few years on this very interesting research project called Sankie and I think this has elements of using AI and machine learning in making the SDLC more effective. Talk a bit about that.

    Chandra: Sankie is a project which we started at the end of 2016. One of the primary goals of Sankie is to provide the ability to join the various data that is collected at different phases of the software development life cycle, leverage techniques like AI/ML to do analysis on top of that data, and provide valuable insights which can aid various stakeholders in each phase of the software development life cycle.

    Chetan: I think Chandra put it in a great way. The whole motivation behind Sankie was to infuse AIOps into the software development processes across Microsoft. And it has been a huge collaborative effort with several collaborators, such as B. Ashok, Rahul Kumar, Ranjita Bhagwan, Sonu Mehta and Jim Kleewein, as well as several Research Fellows across MSR and our counterparts in different parts of Microsoft, who have worked with us over the last several years.

    Sridhar: Ok. Now, I get the feeling that both of you have somewhat oversimplified what Sankie actually is. I’ve sat through various talks in which there seems to be a huge amount of work that goes into the different components that feed into Sankie, which seems to be kind of like a platform. Why don’t you guys talk a little more about what Sankie actually is and what the different constituent parts are, so to speak?

    Chandra: So, Sankie is actually a platform that we have been building. Sankie has loaders that ingest data from various phases of the software development life cycle: from the development phase, it ingests data about pull requests, commits and builds; from the testing phase, data about test cases, test executions and test status; and from the deployment phase, data about alerts, exceptions, and other telemetry. We put all this data together in a single queryable data source. That is very important, because this data lives in various disparate data sources exposed at various levels, and Sankie gets all of it into a single relational data store which can be easily queried and joined. We then feed this data into various AI and ML tools to provide insights and recommendations in various phases of the software development life cycle. For example, we mine all the commit data- which files are changed together, which files go into a pull request, and so on- to discover rules describing files that are always changed together, and we use that knowledge to warn developers creating pull requests if they are missing any files. We call it related-files analysis (a toy sketch of the underlying co-change mining idea follows below). Similarly, we developed tools like ORCA, an Online Root Cause Analysis tool aimed at root-causing service incidents and disruptions as quickly as possible. ORCA, interestingly, uses data from both the left side and the right side of the software development life cycle: the commits, the code that is written and its diffs, together with the telemetry collected on the deployment side, that is, the exceptions and errors occurring in the service. ORCA takes all these exceptions and errors and can point them to the actual code change that introduced the problem in the first place, which is pretty fascinating, because it greatly reduces the time developers spend root-causing issues- something that typically takes a couple of days, or sometimes even weeks, depending on the complexity of the issue. Sankie has close to 8 such recommenders, which combine data from different phases of the SDLC and leverage AI/ML techniques- the AIOps processes- to make the entire development life cycle more optimal and efficient.
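
    Here is that sketch: a deliberately simple version of co-change rule mining, with invented thresholds and file names- not the actual Sankie implementation- showing how commit history can suggest files that are likely missing from a pull request.

        from collections import Counter
        from itertools import combinations

        def mine_cochange(commits, min_support=2, min_confidence=0.6):
            # Mine "files that change together" rules from commit history.
            # commits: list of sets of file paths touched in each commit.
            # Returns rules (src, dst, confidence) meaning: when src changes,
            # dst usually changes too.
            file_count = Counter()
            pair_count = Counter()
            for files in commits:
                file_count.update(files)
                pair_count.update(combinations(sorted(files), 2))

            rules = []
            for (a, b), n in pair_count.items():
                if n < min_support:
                    continue
                for src, dst in ((a, b), (b, a)):
                    conf = n / file_count[src]
                    if conf >= min_confidence:
                        rules.append((src, dst, conf))
            return rules

        def missing_files(pr_files, rules):
            # Given the files in a pull request, suggest likely-missing ones.
            return {dst for src, dst, _ in rules
                    if src in pr_files and dst not in pr_files}

        commits = [
            {"api.c", "api.h"},
            {"api.c", "api.h", "docs.md"},
            {"api.c", "api.h"},
            {"util.c"},
        ]
        rules = mine_cochange(commits)
        print(missing_files({"api.c"}, rules))  # suggests {'api.h'}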

    Chetan: To add to what Chandra just said about Sankie: at the beginning of the podcast we briefly discussed how the move to the cloud and service-oriented software development has posed some new and interesting challenges. But in this case, we are actually able to use that to our advantage, since in Sankie we are building services which we can deploy and iterate on very fast, based on the feedback from our users and on the telemetry we are getting from the services. And lastly, because of this cloud-oriented architecture, we are able to use big data technologies and service-oriented architectures to process terabytes of data and telemetry produced by different user-facing services, combine that with machine learning algorithms, and provide insights which are very valuable to the end users of the Sankie platform.

    Sridhar: Now, is Sankie available to the world outside of Microsoft?

    Chetan: As part of Sankie, a key focus has been on making sure that all of our techniques and algorithms are published at major software and systems conferences. So we have published research papers and articles about the Sankie platform, its architecture, and even the 8 different recommenders which Chandra talked about.

    Sridhar: Ok. So, since it’s all available in the public domain, we will make the links available along with the transcript of this podcast. Ok, let’s do a little bit of crystal ball gazing now. Where do you see software development, engineering and DevOps evolving in the future?

    Chandra: I think that’s a great question. As Marc Andreessen famously said, “Software is eating the world”. A lot of traditional companies are becoming tech companies- you can see tech penetrating every industry: automobile, pharmaceutical, retail, everywhere. This actually makes software development more complex, and we need to react to customer requests much faster, which makes AIOps much more relevant. Using AI/ML technologies to make the entire software development life cycle more efficient, and to deliver value to the customers and users who subscribe to our services, is going to become way more important.

    Chetan: To add to what Chandra just said, there are two things that make me excited about how we can evolve Sankie and similar projects to prepare for the next shift in the software development industry. The first is the growing usage of software and machine learning in cyber-physical systems, for example in self-driving cars and in agriculture- systems which are safety critical, time critical, and impact humans in a big way. Evolving Sankie and similar tools and techniques for those verticals of software and services will, I think, be a key challenge and opportunity. The second is the move to the edge: the software industry has seen software 1.0 and 2.0, and now there is this move to the edge, where compute is available at the edge of the network, accessible and located close to the user. How we can leverage Sankie and similar techniques for the edge-focused cloud is another interesting aspect we are excited about.

    Sridhar: Ok. So, Chandra and Chetan, this has been a fantastic and fascinating conversation. Thank you so much once again for your time.

    Chandra: Thanks Sridhar.

    Chetan: Thanks, Sridhar, for this insightful conversation. Thank you.

  • Microsoft Research India Podcast – Podcast (15)
    Podcast: What 'bhasha' do you want to talk in? With Kalika Bali and Dr. Monojit Choudhury1 Jun 2020· Microsoft Research India Podcast

    Episode 003 | June 02, 2020

    Many of us who speak multiple languages switch seamlessly between them in conversations and even mix multiple languages in one sentence. For us humans, this is something we do naturally, but it’s a nightmare for computing systems to understand mixed languages. On this podcast with Kalika Bali and Dr. Monojit Choudhury, we discuss codemixing and the challenges it poses, what makes codemixing so natural to people, some insights into the future of human-computer interaction and more.

    Kalika Bali is a Principal Researcher at Microsoft Research India working broadly in the area of Speech and Language Technology, especially in the use of linguistic models for building technology that offers more natural Human-Computer as well as Computer-Mediated interactions, and technology for Low Resource Languages. She has studied linguistics and acoustic phonetics at JNU, New Delhi and the University of York, UK, and believes that local language technology, especially with speech interfaces, can help millions of people gain entry into a world that has till now been almost inaccessible to them.

    Dr. Monojit Choudhury has been a Principal Researcher at the Microsoft Research Lab in India since 2007. His research spans many areas of Artificial Intelligence, cognitive science and linguistics. In particular, Dr. Choudhury has been working on technologies for low resource languages, code-switching (mixing of multiple languages in a single conversation), computational sociolinguistics and conversational AI. He has more than 100 publications in international conferences and refereed journals. Dr. Choudhury is an adjunct faculty member at the International Institute of Information Technology, Hyderabad and at Ashoka University. He also organizes the Panini Linguistics Olympiad for high school children in India and is the founding chair of the Asia-Pacific Linguistics Olympiad. Dr. Choudhury holds a B.Tech and a PhD degree in Computer Science and Engineering from IIT Kharagpur.

    Related

    · Microsoft Research India Podcast: More podcasts from MSR India

    · iTunes: Subscribe and listen to new podcasts on iTunes

    · Android

    · RSS Feed

    · Spotify

    · Google Podcasts

    · Email

    Transcript

    Monojit Choudhury: It is quite fascinating that when people become really familiar with a technology, and search engine is an excellent example of such a technology, people really don’t think of it as technology, people think of it as a fellow human and they try to interact with the technology as they would have done in natural circumstances with a fellow human.

    [Music plays]

    Host: Welcome to the Microsoft Research India podcast, where we explore cutting-edge research that’s impacting technology and society. I’m your host, Sridhar Vedantham.

    [Music plays]

    Host: Many of us who speak multiple languages switch seamlessly between them in conversations and even mix multiple languages in one sentence. For us humans, this is something we do naturally, but it’s a nightmare for computing systems to understand mixed languages. On this podcast with Kalika Bali and Monojit Choudhury, we discuss codemixing and the challenges it poses, what makes codemixing so natural to people, some insights into the future of human-computer interaction and more.

    [Music plays]

    Host: Kalika and Monojit, welcome to the podcast. And thank you so much. I know we’ve had trouble getting this thing together given the COVID-19 situation, we’re all in different spots. So, thank you so much for the effort and the time.

    Monojit: Thank you, Sridhar.

    Kalika: Thank you.

    Host: Ok, so, to kick this off, let me ask this question. How did the two of you get into linguistics? It’s a subject that interests me a lot because I just naturally like languages and I find the evolution of languages and anything to do with linguistics quite fascinating. How was it that both of you got into this field?

    Monojit: So, meri kahani mein twist hai (In Hindi- “there is a twist in my story”). I was in school, quite a geeky kind of a kid, and my interests were the usual- Mathematics, Science, Physics- and I wanted to be a scientist or an engineer and so on. And I did study language: I know English and Hindi, which I studied in school, Bangla is my mother tongue, so of course I know that, and I also studied Sanskrit in great detail, and I was interested in the grammar of these languages. Literature was not something that would pull me, so language stayed on the back bench; what I really loved was Science and Mathematics. And naturally I ended up in IIT- I studied at IIT Kharagpur for 4 years doing Computer Science- and everything was lovely. And then one day, in our final year, there was a project where my supervisor was working on what is called a text to speech system. Such a system takes Hindi text and automatically speaks it out, and there was a slight problem that he was facing, and he asked me if I could solve it. I was in my final undergrad year at that time. And the problem was how to pronounce Hindi words correctly. At that time, it sounded like a very simple problem, because in Hindi the way we write is the way we pronounce, unlike English, where you really have to learn the pronunciations. And it turns out, it isn’t. If you think of the words ‘Dhadkane’ and ‘Dhadakne’, you write them in pretty much exactly the same way, but one is pronounced ‘Dhadkane’ and the other ‘Dhadakne’. So, this was the issue. My friend, who was also working with me, was all for machine learning. And I was saying, there must be a pattern here, and I went through lots and lots of examples myself, and it turned out that there is a very short, simple, elegant rule which can explain the pronunciation of most Hindi words perfectly. So, I was excited. I went to my professor and showed him the thing, and he said, “Oh! This is fantastic! Let’s write a paper”, and we got a paper, and all this was great. But then, when I was presenting the paper, somebody said, “Hey, you know what problem you solved? It’s called ‘schwa deletion’ in Hindi.” Of course, I wasn’t in linguistics, and neither was my professor, so he had no clue what a ‘schwa’ was or what ‘schwa deletion’ was. I dug a little deeper and found out that people had written entire books on schwa deletion, and what I had found was actually in line with the research people had done. And this got me really excited about linguistics. More interestingly, like you said about language evolution, think of why this is there. Hindi uses exactly the same style of writing that we use for Sanskrit. But in Sanskrit, there is no schwa deletion. And if you look at all the modern Indian languages which came from Sanskrit- like Hindi, Bengali or Oriya- their pronunciations have diverged from Sanskrit to different degrees. I am not getting into the detail of what exactly schwa deletion is, that’s beside the point. But the pronunciations evolved from the original language. The question I then eventually got interested in is how this happens and why this happens. And I ended up doing a Ph.D. with the same professor on language evolution and how sound change happens across languages. And of course, being a computer scientist, I tried modelling all these things computationally.
    And then there was no looking back; I went deeper and deeper into language, linguistics and natural language processing.
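
    To give a flavor of what such a rule can look like, here is a toy, hand-simplified sketch of context-sensitive schwa deletion: delete an inherent schwa that sits between a preceding vowel-consonant and a following consonant-vowel. This is purely an editorial illustration, not the actual rule from Monojit’s paper; the romanized phoneme inventory and the right-to-left scan are simplifying assumptions.

    ```python
    # Toy sketch of a context-sensitive schwa-deletion rule for Hindi
    # (hand-simplified; the published rule is more refined).
    # Words are phoneme lists; "a" marks the inherent schwa.
    VOWELS = {"a", "aa", "i", "ii", "u", "uu", "e", "ai", "o", "au"}

    def is_vowel(p):
        return p in VOWELS

    def delete_schwas(phonemes):
        """Delete a medial schwa in the context V C _ C V, scanning right
        to left so a deletion does not bleed the next context checked."""
        out = list(phonemes)
        for i in range(len(out) - 3, 1, -1):
            if (out[i] == "a"                     # candidate schwa
                    and not is_vowel(out[i - 1])  # preceded by a consonant...
                    and is_vowel(out[i - 2])      # ...which follows a vowel
                    and not is_vowel(out[i + 1])  # followed by a consonant...
                    and is_vowel(out[i + 2])):    # ...which precedes a vowel
                del out[i]
        return out

    # Written form with every inherent schwa: dh-a-d-a-k-a-n-e ("dhadakane").
    # This toy rule yields one attested pronunciation, "dhadakne".
    print(delete_schwas(["dh", "a", "d", "a", "k", "a", "n", "e"]))
    ```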

    Host: That's fascinating. And I know for sure that Kalika has got an equally interesting story, right? Kalika, you have an undergrad degree in chemistry?

    Kalika: I do.

    Host: Linguistics doesn’t seem very much like a natural career progression from there.

    Kalika: Yes, it doesn’t. But before I start my story, I have one more interesting thing to say. When Monojit was presenting his ‘schwa deletion’ paper, I was in the audience. I was working somewhere else at the time, and I looked at my colleague and said, “We should get this guy to come and work with us.” So, I actually was there when he presented that particular schwa deletion paper. So, yes, I was a Science student studying Chemistry, and after Chemistry, the thing in my family was that everybody goes on to higher studies. I rebelled- I was one of those difficult children that we are now very unhappy about- and said that I didn’t want to study anymore. I definitely didn’t want to do Chemistry, and I was going to be a journalist, like my dad. I had already got a job at a newspaper. And then I went to Jawaharlal Nehru University to pick up a form for my younger sister, looked at the university and said, “This is a nice place, I want to study here.” I flicked through the prospectus asking myself, “what’s interesting?”, and I saw this thing called Linguistics, which seemed very fascinating- I had no idea what linguistics was about. There was also ancient history, which I did know about, and it seemed interesting too. So, I filled in forms and sat for the entrance exam, after having read a thin layman’s guide to linguistics I borrowed from the British Council Library. The interesting thing is that the linguistics entrance exam was in the morning and the ancient history exam was in the afternoon. This was peak summer in Delhi, and there were no fans in the place where the exam was being held. So, after taking the linguistics exam, I thought, I can’t sit for another exam in this heat, and I left. So, I only took the linguistics exam. I got through- no one was more surprised than I was- and I saw it as a sign. So, I started the course without having any idea what linguistics was and completely fell in love with the subject within the first month. And coming from a science background, I was very naturally attracted towards phonetics; to really understand phonetics and the speech science part of linguistics, you do need a good understanding of how waves work- the physics of sound. So, that part came a little naturally to me, I was attracted towards speech, and the rest, as they say, is history.

    Host: Nice. So, chemistry’s loss is linguistics gain.

    Kalika: Yeah, my gain as well.

    Host: Ok, so, I’ve heard you and Monojit talk at length and many times about this thing called codemixing. What exactly is codemixing?

    Kalika: So, codemixing is when people in a multi-lingual community switch back and forth between two or more languages. And all of us here come from multi-lingual communities where, at a community level- not necessarily at an individual level- people speak more than one language: two, three, four. It’s very natural for us to keep switching between these languages in a normal conversation. Right now, of course, we are sticking to English, but if this was, say, a different setting, we would probably be switching between Hindi, Bengali and English, because those are three languages all three of us understand.

    Host: That’s true.

    Kalika: That’s what code switching is, when we mix languages that we know, when we talk to each other, interact with each other.

    Host: And how prevalent it is?

    Kalika: “Abhi bhi kar sakte hain” (in Hindi- “we can even do it now”). We can still switch between languages.

    Monojit: Yeah.

    Host: “Korte pari” (In Bangla- “we can do that”). Yeah, Monojit, were you saying something when I interrupted you?

    Monojit: You asked how prevalent it is. Linguists have observed that in all multi-lingual societies, where people know multiple languages at a societal level, they codemix. But there was no quantitative data on how much mixing there is, and one of the first things we tried to do when we started this project was to do some measurement and see how much mixing really happens. We looked at social media, where people usually talk the way they talk in real life- I mean they type it, but it’s almost like speech. We studied English-Hindi mixing in India, and one of the interesting things we found is that if you look at public forums on Facebook in India, and you look at sufficiently long threads, let’s say 50 or more comments, then all of them are multi-lingual: you will find at least two comments in two different languages, and sometimes many languages, not only two. And interestingly, if you look at each comment and try to measure how many of them are mixed within themselves- a single comment containing multiple languages- it’s as high as 17%. Then we extended this study to Twitter, for seven languages: English, French, Italian, Spanish, Portuguese, German and Turkish. And we studied how much codemixing was happening there. Again, interestingly, 3.5% of the tweets from, I would say, the western hemisphere are codemixed. I would guess that from South Asia the number would be very high- we already said 17% for India itself. What’s also interesting is that if you look at specific cities, the amount of codemixing varies a lot. In our study we found that Istanbul has the largest share of codemixed tweets, as high as 13%, whereas in some cities in the US- let’s say Houston, or cities in the southern United States, where we know there is a huge number of English-Spanish bilinguals- we see around 1% codemixing. So, yes, it’s all over the world and it’s very prevalent.
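
    As a concrete illustration of this kind of measurement, here is a minimal sketch that tags each token with a language and counts the messages containing more than one language. The seed word lists are made-up stand-ins; the actual studies used trained word-level language identifiers rather than lexicon lookup.

    ```python
    # Minimal sketch: estimate the fraction of code-mixed messages by
    # tagging tokens against tiny seed lexicons (illustrative only).
    HINDI = {"hai", "mein", "nahi", "kya", "aur", "bahut"}
    ENGLISH = {"the", "is", "and", "what", "very", "when"}

    def token_langs(message):
        """Return the set of languages detected among the tokens."""
        langs = set()
        for tok in message.lower().split():
            if tok in HINDI:
                langs.add("hi")
            elif tok in ENGLISH:
                langs.add("en")
        return langs

    def codemix_rate(messages):
        """Fraction of messages containing more than one language."""
        mixed = sum(1 for m in messages if len(token_langs(m)) > 1)
        return mixed / len(messages)

    msgs = ["what is the plan", "kya plan hai", "plan kya hai and when"]
    print(f"{codemix_rate(msgs):.0%} of messages are code-mixed")  # 33%
    ```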

    Kalika: Yeah, and I would like to add that there is this mistaken belief that people codemix because they are not proficient in one language, you know, people switch to their so called native language or mother tongue when they are talking in English because they don’t know English well enough or they can’t think of the English word when they are talking in English and therefore they switch to, say Hindi or Spanish or some other language. But that actually is not true. For people to be able to fluently switch between the two languages and fluently codemix and code switch, they actually have to know both the languages really well. Otherwise, it’s not mixing or code switching, it is just borrowing… borrowing from one language to another.

    Host: Right. So, familiarity with multiple languages basically gets you codemixing, whereas if you are forced to do it, that’s not codemixing. Codemixing is more intentional and purposeful, is what you are saying.

    Kalika: Exactly.

    Host: Ok. Do you see any particular situations or environments in which codemixing seems to be more prevalent than not?

    Kalika: Yeah, absolutely. In more formal scenarios, we definitely tend to stick to one language, and if you think about it, even if you are a monolingual, when you are talking in a formal setting you use a very structured and very different kind of language than when you are speaking in an informal scenario. As far as codemixing is concerned, linguists started looking into this a while ago- some of the first papers published on code switching are from the 1940s. At that time, it was definitely viewed as an informal use of language, but as informal language has become much more acceptable in various scenarios over the decades, we have also started codemixing in many more scenarios. Earlier, if you looked at television, people stuck to just one language at a time: if it was a Hindi broadcast, it was just Hindi; if it was an English broadcast, it was just English. But now television and radio both switch between English and multiple Indian languages when they are broadcasting. So, though it started as a much more informal use-case, it’s now prevalent in many scenarios.

    Monojit: And to add to that, there is a recent study which says that there are all the signs that Hinglish- the mixing of Hindi and English- is altogether a new language rather than mixing. Because there are children who grow up with it as their mother tongue: they hear Hinglish being spoken, or in other words codemixing between these two languages happening all the time in their family, by their parents and others around them, and they take that as the native language they learn. So, it’s quite interesting: on one extreme, as Kalika mentioned earlier, there are words which are borrowed- you just borrow them to fill a gap which is not there in your language, or because you can’t remember a word, whatever the reason might be. On the other extreme, you have two languages that are fused to give a new language- these are called fused lects, like Hinglish. I would leave it to you to decide whether you consider it a language or not. But there are definitely movies which are entirely in Hinglish, and ads which are in Hinglish, where you can’t say it’s either Hindi or English. And in between, of course, there is a spectrum of different levels of integration of mixing between the languages.

    Host: This is fascinating. You are saying something like Hinglish, kind of becomes a language that’s natural rather than being synthetic.

    Kalika: Yes.

    Monojit: Yes.

    Host: Wow! Ok.

    Kalika: I mean, if you think of a mother tongue as the language that you dream in and then ask yourself what is the language that you dream in- I dream in Hinglish, so that’s my mother tongue.

    [Music plays]

    Host: How does codemixing come into play or how does it impact the interaction that people have with computers or computing systems and so on?

    Monojit: So, you know, there is again another misconception. In the beginning we said that when people codemix, they know both languages equally well. The misconception is that if I know both Hindi and English, and my system- let’s say a search engine, a speech recognition system or a chat bot- understands only one of the languages, let’s say English, then I will not use the other language or mix the two. But we have seen that this is not true. In fact, a long time ago- and by long I mean, let’s say, ten years ago- when there was no research in computational processing of codemixing and there were no systems which could handle codemixing, even then we saw that people issued a lot of queries to Bing which were codemixed. My favorite example is this one: “2017 mein, scorpio rashi ka career ka phal” (in Hindi- “the career forecast for the zodiac sign Scorpio in 2017”). This is an actual query, and everything is typed in the Roman script. So it has mixed languages, mixed scripts and everything. It is quite fascinating that when people become really familiar with a technology- and the search engine is an excellent example of such a technology- people really don’t think of it as technology; they think of it as a fellow human, and they try to interact with the technology as they would have done in natural circumstances with a fellow human. And that’s why, even though we design chat bots or ASR (automatic speech recognition) systems with one particular language in mind, when we deploy them, we see everybody mixing languages, often without even realizing that they are mixing. So in that sense, all technologies that we build which are user facing, or which analyze user-generated data, ideally should have the capability to process codemixed input.

    Host: So, you used the word ideally, which obviously means that it’s not necessarily happening too often, or as much as it should be. So, what are the challenges out here?

    Kalika: Initially, the challenge was to accept that this happens. But now we have crossed that barrier, and people do accept that a large percentage of this world lives in multi-lingual communities and that this is a problem- if people are to interact naturally with so-called natural language systems, then those systems have to understand and process codemixing. But I think the biggest challenge is data, because most language technologies these days are data hungry. They are all based on machine learning and deep neural network systems, and we require a huge amount of data to train these systems. And it’s not possible to get data for codemixing in the same way as we can for mono-lingual language use, because if you think about it, the variation in codemixing- in where you can switch from one language to another- is very high. To get enough examples in your data of all the possible ways in which people can mix two languages is a very, very difficult task. And this has implications for almost all the systems that we might want to look at, like machine translation and speech recognition, because all of these ultimately rest on language models, and to train those language models we need this data.

    Host: So, are there any ways to address this challenge of data?

    Monojit: So, there are several solutions that we actually thought of. One is asking a fundamental question: “Do we really need a new data set for training codemixed systems?” For instance, imagine a human being who knows two languages- say Hindi and English, which the three of us know- and imagine that we have never heard anybody mix these two languages before. A better example might be English and Sanskrit: I really haven’t heard anybody mixing English and Sanskrit. But if somebody does mix these two languages, would I be able to understand? Would I be able to point out- this sounds grammatical and this doesn’t? It turns out that, intuitively at least, for human beings that’s not a problem. We have an intuitive notion of what mixing is and which patterns of mixing are acceptable, and we really don’t need to learn the codemixed language as a separate language once we know the two languages involved equally well. So, this was the starting point for most of our research. We then asked: instead of creating data in the codemixed language, can we start with mono-lingual data sets or mono-lingual models and somehow combine them to build codemixed models? There are several approaches that we took, and they worked to various degrees, but the most interesting one, which I would like to share, is based on linguistic theories. These theories say that, given the grammars of the two languages- say English and Hindi- there are only certain ways in which mixing is acceptable. To give an example: I can say, “I do research on codemixing”. I can codemix and say, “Main codemixing pe research karta hoon” (Hindi-English for the same sentence), and it sounds perfectly normal. “I do shodh karya on codemixing” (shodh karya- Hindi for “research work”)- we don’t use that kind of mixing often, and you probably wouldn’t have heard it, but you might still find it quite grammatical. But if I say, “Main do codemixing par shodh karya”, does that sound natural to you? There is something that doesn’t sound right, and linguists have theories on why. Starting from those theories, we built models which can take parallel data in two languages- or, if you have a translator, you can translate a sentence, let’s say “I do research on codemixing”, into Hindi: “Main codemixing (I don’t know what the Hindi for codemixing is) par shodh karya karta hoon”. Given this pair of parallel sentences, there is a systematic process by which you can generate all the possible ways in which the two sentences can be mixed in a grammatically valid way in Hinglish. We built those models- the linguistic theories were only theories, so we had to flesh them out and build real systems which could generate this. And once you have that, you can imagine that there is no dearth of data: you can take any English sentence and convert it into codemixed Hindi-English versions, and then you have a lot of data. Then whatever you could do for English, you can now train the same system on this artificially created data, and you can solve those tasks. That was the basic idea, using which we could solve a lot of different problems, from translation to part-of-speech tagging to sentiment analysis to parsing.
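
    To make the generation step concrete, here is a heavily simplified sketch: given a sentence pair aligned at the phrase level, it enumerates every way of choosing a language per aligned phrase. The alignments are hand-made, and the grammatical filtering that the real, theory-driven system applies to discard invalid switches is deliberately omitted- this only shows the shape of the candidate-generation step.

    ```python
    # Toy candidate generation for codemixed sentences from a phrase-aligned
    # sentence pair. Phrases are kept in Hindi (matrix-language) word order,
    # matching the example in the conversation; the theory-based filtering
    # of ungrammatical candidates is omitted.
    from itertools import product

    def generate_codemixed(aligned_phrases):
        """aligned_phrases: list of (english_phrase, hindi_phrase) pairs."""
        variants = set()
        for choice in product(("en", "hi"), repeat=len(aligned_phrases)):
            tokens = [en if lang == "en" else hi
                      for lang, (en, hi) in zip(choice, aligned_phrases)]
            variants.add(" ".join(tokens))
        return sorted(variants)

    # "I do research on codemixing" <-> "main codemixing pe research karta hoon"
    pairs = [("I", "main"), ("on codemixing", "codemixing pe"),
             ("research", "research"), ("do", "karta hoon")]
    for sentence in generate_codemixed(pairs):
        print(sentence)  # includes "main codemixing pe research karta hoon"
    ```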

    Host: So, what you are saying is that given that you need a huge amount of data to build… build out models, but the data is not available, you just create the data yourself.

    Monojit: Right.

    Host: Wow.

    Kalika: Yes, based on certain linguistic theoretical models which we have made into computational linguistic theoretical models.

    Host: Ok, so, we’ve been talking about codemixing as far as textual data is concerned, for the most part. Now, are you doing something as far as speech is concerned?

    Kalika: Yes, speech is slightly more difficult than pure text, primarily because there you have to look at both the acoustic models as well as the language models. But our colleague Sunayana Sitaram has been working for almost three years now on codemixed automatic speech recognition, and she has come up with this really interesting Hindi-English ASR system which switches between Hindi and English and is able to recognize a person speaking in mixed Hindi-English speech.

    Host: Interesting. And where do you see the application of all the work that you guys have done? I mean, I know you have been working on this stuff for a while now, right?

    Kalika: Think of opinion mining, where you are looking at a lot of user generated data. If that data is a mix of, say, English and Spanish, and your system can only process and understand English, it can’t understand either the Spanish part or the mixed part, where English and Spanish occur together. The chances are that you will get a very skewed and most probably incorrect view of what the user is saying, of what the user’s opinion is. And therefore, any analysis you do on top of that data is going to be incorrect. I think Monojit has a very good example of that from the work we did on sentiment and codemixing on Twitter, where he looked at how negative sentiment was expressed.

    Monojit: Yeah. That’s actually pretty interesting. This brings us to the question of why people codemix. We said in the beginning that, first, it’s not random, and second, it seems to have a purpose. So what is that purpose? Of course, there are lots of theories and observations from linguists, covering humor, sarcasm, even reported speech- all of these involve various degrees of codemixing, and there are reasons for this. We thought: there is a lot of codemixing on social media, so we could do a systematic, quantitative study of the different features which make people switch from Hindi to English or vice-versa. We formulated a whole bunch of hypotheses to test, based on current linguistic theories. Our first hypothesis was that people might switch from English to Hindi when they move from facts to opinions, because it’s well known that when you are talking of facts, you can speak in any language- in the Indian context, more likely English- whereas when you are expressing something emotional, or an opinion, you are more likely to switch to your native language. We tried to test all these hypotheses, and nothing was statistically significant; we didn’t see strong signals for them in the data. But where we did see a really strong signal is this: when people are expressing negative sentiment, they are more likely- actually nine times more likely- to use Hindi than when they are expressing positive sentiment. It seems English is the preferred language for expressing positive sentiment, whereas Hindi is the preferred language for expressing negative sentiment. And we wrote a paper based on these findings: we might praise you in English, but gaali to Hindi mein hi denge (In Hindi- “we will swear only in Hindi”). So, if you did sentiment analysis in only one language, let’s say English, and tried to do trend analysis of some Indian political issue based on that, it is very likely that you would get a much rosier picture, because in English people would have said more positive things, and all the gaalis (cuss words) and negative things would actually be in Hindi, which you would be missing out on. So ideally you should process all the languages when you are looking at a multi-lingual society and analyzing content from there.
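
    As a toy illustration of the arithmetic behind a claim like “nine times more likely”, here is a sketch comparing how often Hindi appears under negative versus positive sentiment in a set of labeled messages. The counts are invented purely for the example; the actual study’s data and method are in the paper.

    ```python
    # Toy sketch: given (language, sentiment) labels for messages, compute
    # how much more likely Hindi is under negative sentiment than positive.
    # These counts are invented purely for illustration.
    from collections import Counter

    labeled = ([("hi", "neg")] * 9 + [("hi", "pos")] * 1
               + [("en", "neg")] * 5 + [("en", "pos")] * 5)

    counts = Counter(labeled)
    p_hi_neg = counts["hi", "neg"] / (counts["hi", "neg"] + counts["en", "neg"])
    p_hi_pos = counts["hi", "pos"] / (counts["hi", "pos"] + counts["en", "pos"])
    print(f"Hindi is {p_hi_neg / p_hi_pos:.1f}x more likely in negative "
          "messages than in positive ones")  # -> 3.9x on these toy counts
    ```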

    Kalika: Yeah. And this actually touches on why people codemix, which is a very vast area of research, because people codemix for a lot of reasons. People might codemix because they want to be sarcastic, or to signal in-group membership- the three of us can move to Bengali to bond and show that we are part of the group that knows Bengali. Or you meet somebody and you want to keep them at a distance, so you don’t switch to their language or mix. People do it for humor, people do it for reinforcement- there are a lot of reasons why people codemix, and if we miss out on all that, it’s very hard for us to make any firm claims about why people are saying what they are saying.

    Host: It seems like this is an extremely complex area of research, which spans not just computer science and linguistics but also touches sentiment, opinion, etc.- there’s a whole lot going on here.

    Monojit: Yeah, and in fact most of the computational linguistics work that you’d see draws from linguistics on how grammar works- syntax- and maybe how meaning works- semantics. But codemixing goes much beyond that: we are now talking of what are called pragmatics and sociolinguistics. Pragmatics is about how language is used in a particular context or situation. And modelling pragmatics is insanely difficult, because you not only need to know the language, you need to know the speakers, the relationship between the speakers, and the context in which they are situated and speaking. A typical example: if I ask you, “Could you please pass the water bottle?”, it is literally a question, and you could answer, “Yes, I can.” But that’s not what will satisfy me- it’s actually a request. That’s how we use language: what we say is not necessarily what we mean, and understanding this hidden intent is very situational. In different situations, the same sentence might mean very different things. Codemixing sits at the boundaries of syntax, semantics and pragmatics. And sociolinguistics is the study of how language is used in society, especially how social variables correlate with linguistic variables. Social variables could be somebody’s level of education, age, gender, where somebody is from, and so on; linguistic variables are things like whether speech is codemixed, and to what degree. And we do see some very strong social factors which determine codemixing behavior. In fact, that’s used a lot in our Hindi movies- Bollywood. We did a study on Bollywood scripts- some 37 or 40 Hindi movie scripts which are freely available for research online- to see where codemixing happens in Bollywood. What we found is that codemixing is employed in a very sophisticated way by the script writers in two particular situations. One is when they want to show a sophisticated urban crowd as opposed to a rural one. If you look at movies like “Dum Lagake Haisha”, which are set in a small town, in a rural scenario or in the past, those movies will usually have a lot less codemixing; whereas movies like “Kapoor & Sons” or “Pink”, which are set in a city among educated, urban people, use codemixing heavily to signal exactly that. The other case where Bollywood uses a lot of codemixing- in fact, accented codemixing- is when they want to show that somebody has been to “foreign”, as we would say- abroad- and has come back to India to interact with their poor country cousins. So, it’s used in many different ways in the movies, and that’s the sociolinguistics bit kicking in.

    Kalika: And to add to that- something we touched upon earlier- this usage has changed over time. In earlier Bollywood movies, mixing was much rarer. Not only that, English was mostly used to mark the villain: if you look at movies from the 1960s or 70s, it’s always the smugglers and the kingpins of the mafia who spoke a lot of English and mixed English into Hindi. So obviously that kind of change has happened over the years, even in Bollywood movies.

    Host: I would never have thought about all these things. Villains speaking English, ok, in Bollywood!

    [Music plays]

    Host: Where do you see this area of research going in the future? Do you have anything in particular in mind, or are you just exploring to see?

    Kalika: I think one of the things we have been looking at a lot is where codemixing fits in when AI interacts with users- this human-AI interaction scenario. One aspect is that the user mixes and the system understands; but does the bot, the AI agent, also have to mix or not? And if it has to mix, then where and when should it mix? That’s something we have been looking at, and something we think is going to play an important role in human-AI interaction going forward. We’ve studied this in some detail, and it’s actually very interesting: people have a whole variety of attitudes, not only towards codemixing but also towards AI bots interacting with them. And this reflects in how they feel about a bot that codemixes and talks to them in a mixture of languages, irrespective of whether they themselves codemix or not. Our study has shown that some people look at a bot which codemixes as ‘cool’, in a very positive way, but some people view it very negatively. The reasons vary: some people think that codemixing is not the right thing to do- that it’s not a pure language. Others think that a bot should talk in a very “proper” way, so it should only talk in English or only in Hindi and shouldn’t mix. And a certain set of people are freaked out by the fact that the AI sounds more human-like when it mixes. So, there is a wide range of attitudes towards a codemixing AI agent. How do we tackle that? How do we make a bot that codemixes, or doesn’t, and still pleases the entire crowd?

    Host: Is there such a thing like pleasing the entire crowd?

    Kalika: So, we have ideas about that- about how to go about trying to at least please the crowd.

    Monojit: Yeah. Basically, you have to adapt to the speaker. Essentially, the way we please the crowd is through accommodation. When I talk to somebody who is primarily speaking in English, I will try to reciprocate in English; if somebody is speaking in Hindi, I will try to speak in Hindi if I want to please that person. Of course, if I don’t, then I will use the other language to signal social distance. This is what is described by linguistic accommodation theory. More generally, there are various style components that we incorporate in our day-to-day conversation, mostly unknowingly, based on whether we want to please the other person or not. So, call it sycophancy or whatever, but we want to build bots which model that kind of behavior. And if we are successful, then the bot will be a crowd pleaser.

    Kalika: I don’t think it has so much to do with sycophancy- human beings have to cooperate, and that is, in a sense, hardwired into us to a certain extent. For evolutionary reasons we need to cooperate, and to have a successful interaction we have to cooperate. One of the ways we do this is by trying to be more like the person we are talking to, with both parties converging to a middle ground- and that’s what accommodation is all about.

    Host: So, Kalika and Monojit, this has been a very interesting conversation. Are there any final thoughts you’d like to leave with the listeners?

    Kalika: I hope people get an idea, through our work on codemixing, of how intricate human communication is. There are many factors that come into play when human beings communicate with each other: social context, pragmatic context and, of course, the structure of the language and the meaning that you are trying to convey- all of it plays a big role in how we communicate. And by studying codemixing in this context, we are hopefully able to grapple with a lot of these factors which, in general human-human communication, become too big to handle all at once.

    Monojit: Yeah. Language is an extremely complicated and multi-dimensional thing. Codemixing is just one of the dimensions, where we are talking of switching between languages, but even within languages there are words and structural differences, and sometimes you can use features of another language in your own language. It won’t be called codemixing, but essentially you are mixing. For instance, accents- when you speak your own native language in an accent borrowed from another language. In Indian English we use things like “little-little”, as in “those little-little things that we say”. Now, “little-little” is not really an English construct; it’s a Hindi or Indian language construct which we are borrowing into English. Studying all of this at once would be extremely difficult. But codemixing does provide us with a handle on this problem of computationally modeling pragmatics and sociolinguistics and all those concepts. And we can then model these things not just for the sake of modeling- there are concrete use-cases, in fact needs. Users are already codemixing when they use technology, so technology should respond by understanding codemixing and, if possible, even generating it. Through this entire research, we are trying to close the loop: linguistic theories are used to build computational models, these models are taken to users in all their complications and complexities, and then we understand and learn from the user-technology interaction and feed that back into our models. This entire cycle- from theory to application to deployment- is what we would like to pursue and get deeper insight into, in the context of natural language processing.

    Host: And I am looking forward to doing another podcast once you guys have gone down the road with your research on that. Kalika and Monojit, this was a very interesting conversation. Shukriya (In Hindi/ Urdu- thank you).

    Kalika: Aapka bhi bahut bahut thank you (In Hindi- many thanks to you too). It was great fun.

    Monojit: Thank you, Sridhar. Khoob enjoy korlam ei conversationta tomar shaathe. Aar ami ekta kotha (In Bangla- “I very much enjoyed this conversation with you, Sridhar. And there’s one thing”) I want to tell the audience: never feel apologetic when you codemix. It is all very natural, and don’t think you are speaking an impure language. Thank you.

    Host: Perfect.

    [Music plays]

  • Microsoft Research India Podcast – Podcast (16)
    Enabling Rural Communities to Participate in Crowdsourcing, with Dr. Vivek Seshadri20 Mar 2020· Microsoft Research India Podcast

    Episode 002 | March 20, 2020

    Enabling Rural Communities to Participate in Crowdsourcing, with Dr. Vivek Seshadri

    Crowdsourcing platforms and the gig economy have been around for a while. But are they equally accessible to all communities? Dr. Vivek Seshadri, a researcher at Microsoft Research India, doesn’t think so, and is trying to change this. On this podcast, Vivek talks about what motivated him to focus on research that can help underserved communities, and in particular, about Project Karya, a new platform to provide digital work to rural communities. The word “Karya” literally means “work” in a number of Indian languages.

    Vivek primarily works with the Technology for Emerging Markets group at Microsoft Research India. He received his bachelor's degree in Computer Science from IIT Madras, and a Ph.D. in Computer Science from Carnegie Mellon University where he worked on problems related to Computer Architecture and Systems. After his Ph.D., Vivek decided to work on problems that directly impact people, particularly in developing economies like India.

    Related

    · Microsoft Research India Podcast: More podcasts from MSR India

    · iTunes: Subscribe and listen to new podcasts on iTunes

    · Android

    · RSS Feed

    · Spotify

    · Google Podcasts

    · Email

    Transcript

    Vivek Seshadri: If you look at crowdsourcing platforms today, there are a number of challenges that prevent them from being accessible to people from rural communities. The first one is that most of these platforms contain tasks only in English, and all their task descriptions- everything- are in English, which is completely inaccessible to rural communities. Secondly, if you go to rural India today, the notion of digital work is completely alien. And finally, there is a logistical challenge: most crowdsourcing platforms assume that the end user has a computer and constant access to the internet. This is actually a luxury in many rural communities in India even today.

    (Music plays)

    Host: Welcome to the Microsoft Research India podcast, where we explore cutting-edge research that’s impacting technology and society. I’m your host, Sridhar Vedantham.

    Crowdsourcing platforms and the gig economy have been around for a while. But are they equally accessible to all communities? Dr. Vivek Seshadri, a researcher at Microsoft Research India, doesn’t think so, and is trying to change this. On this podcast, Vivek talks about what motivated him to focus on research that can help underserved communities, and in particular, about Project Karya, a new platform to provide digital work to rural communities. The word “Karya” literally means “work” in a number of Indian languages.

    Vivek primarily works with the Technology for Emerging Markets group at Microsoft Research India. He received his bachelor's degree in Computer Science from IIT Madras, and a Ph.D. in Computer Science from Carnegie Mellon University where he worked on problems related to Computer Architecture and Systems. After his Ph.D., Vivek decided to work on problems that directly impact people, particularly in developing economies like India.

    (Music plays)

    HOST: Vivek, welcome to the podcast.

    Vivek: Thanks, Sridhar. This is the first time I am doing anything like this, so I am really excited and a little bit nervous.

    Host: Oh, I don't think there's anything to be nervous about really here. You guys are used to speaking in public all the time. So, I'm sure it'll be fine.

    Vivek, you are a computer scientist and you did your PhD in Computer Science in Systems, right? What made you gravitate towards research that helps underserved communities, typically the kind of research that one associates with the ICTD space?

    Vivek: So, Sridhar, when I finished my PhD in 2016, I sort of had two decisions to make- should I stay in the US or should I move back to India? Should I stay in the same area that I am doing research in or should I move to a different field? Both these questions were sort of answered when I visited MSR and had interactions with people like Bill Thies. The kind of research that they were doing impressed me and also influenced me to make the decision to come back to India and work on similar problems that directly impact people.

    Host: That's interesting. So this is something that was brought upon by meeting people in the lab here rather than something that was there in your mind all along.

    Vivek: Absolutely. Actually, when I started my PhD, I wanted to come back and become a professor at a place like IIT or IISc. And when I moved back, I was introduced to MSR by one of my friends who had visited MSR before me. I just thought I'd pay a visit, and the conversations that I had with people here made my decision absolutely easy.

    Host: And the rest is history, as they say.

    Vivek: Absolutely. It’s been three years since I moved here and I couldn't be happier.

    Host: Great. So Vivek, walk us through this project called Karya, which I know you have been associated with for quite a while. What exactly is project Karya and what are your goals with that project?

    Vivek: So, there are two trends that motivate the need for a project like Karya. The first trend is the digital revolution in the world today, where improvements in technologies like Machine Learning are allowing people to interact with devices using natural language. The second trend is specific to India, where the push towards a digital future is creating a lot of tasks like audio transcription, document digitization, etc. Both these trends are going to result in a huge amount of what we call digital work. And the goal of Project Karya is to take this digital work and make it accessible to people from rural communities, who typically have very low incomes today and are predominantly stuck with physical labor. We believe completing these digital tasks and getting paid for them will be a valuable source of supplemental income for people from rural communities.

    Host: Crowdsourcing and crowdsourcing platforms have been around for quite a while now. And they are also well-established methods of gig work. So what's the need for another approach or a different framework like Karya?

    Vivek: That's a great question. If you look at crowdsourcing platforms today, there are a number of challenges that prevent them from being accessible to people from rural communities. Let me describe three. The first one is that most of these platforms contain tasks only in English, and all their task descriptions- everything- are in English, which is completely inaccessible to rural communities. Secondly, if you go to rural India today, the notion of digital work is completely alien. In fact, on our first visit to rural communities, when we told people we would pay them money for completing a set of digital tasks, they looked at us in disbelief- they actually didn't believe we were going to pay them until we did. So, there is a huge issue of awareness. And finally, there is a logistical challenge: most crowdsourcing platforms assume that the end user has a computer and constant access to the internet. This is actually a luxury in many rural communities in India even today.

    Host: So, does Karya enable people to use their existing skillsets and knowledge to earn supplemental or extra income?

    Vivek: So, Sridhar, like I mentioned, there are two sources of digital work that we are looking at currently. One is creating labeled data sets for models like automatic speech recognition and other language-based machine learning models. The second source of digital work is things like speech transcription and document digitization, which the government is extremely interested in. Now, depending on the type of task, people may have to be able to read or type in their regional language. When it comes to reading, we find that most people from rural communities are adept at reading their regional language. When it comes to typing, as you can imagine, there are not many good keyboards that allow you to type in your local language, and this is something most people in rural communities have never done before. In fact, even though most people in rural communities are not familiar with English, they use a very crude form of transliteration to communicate in their regional languages. That's what we observed: most people used WhatsApp, and when communicating with each other they transliterated into the English script rather than typing in their native script.

    Host: So, you are saying that there is a large number of people who are actually typing in the English script, but the language that they are representing is their own vernacular.

    Vivek: Exactly. And the transliteration is very crude. They know what sound each English letter corresponds to, and they just put a bunch of characters next to each other- it's almost like they have created a whole new script for their local language.

    Host: Right.

    Vivek: But something like that wouldn't actually be useful for us; we would want them to type in their local language. For instance, take the example of document digitization. The idea there is that the government has a whole lot of records which contain hand-written words in the local language- names of people, addresses, etc. To digitize these documents, I may want someone to type out the names they see in the document in the local language. There, I would actually want them to use the native script, and not some crude form of transliteration.

    Host: Sure.

    Vivek: So, in this particular case, we actually used a keyboard that was developed by IIT Bombay called Swarachakra. And our users actually learnt to use that keyboard within a very short span of time and they were able to perform extremely well in the task that we had assigned them.

    Host: So, it sounds like there is a lot of work that is readily available. What is required is to actually deliver it and make it possible for people to leverage that work in order to earn extra income.

    Vivek: Absolutely. Actually, the government of India has its own crowdsourcing platform, where they outsource text digitization like I mentioned to anyone in India who wants to do it. Unfortunately, even that platform is not accessible to rural communities. If I go to rural India and ask anyone about that platform, they wouldn't know anything about it. So, in some sense, there is work that is readily available, but there is this huge gap in access.

    Host: And the gap in access is because these platforms work on their traditional paradigm of needing a desktop computer with an internet connection?

    Vivek: Exactly. In fact, the platform that the Government of India has is a website that you have to access, and you need an internet connection to receive and complete tasks. Our goal is to eliminate that requirement. In fact, the goal of Project Karya is to enable anyone with just a smartphone to perform digital tasks on their phone.

    (Music plays)

    Host: I know you've already conducted some experiments with Karya. And you've also published a paper at CHI in 2019. Can you walk us through some of the results of the experiments that you've conducted?

    Vivek: So, one of the biggest challenges in creating a platform like Karya is the perceived lack of trustworthiness of rural labor. When we spoke to many potential work providers about whether they would be willing to outsource their work to rural workers, one of the first questions they asked was whether they could trust the quality of labor from rural workers. So, in the CHI paper, what we wanted to evaluate was the accuracy and effectiveness with which workers from rural India can complete a specific type of digital task. In that particular paper, we looked at text digitization, where the task is as simple as this: the user is shown an image of hand-written text, and all they have to do is type out the word they see in that image. Of course, they are given thousands of images to digitize over a period of two weeks. And what we found in the paper was that workers from rural India did fantastically well. In fact, in a crowdsourced setting, they outperformed a professional transcription firm to which we gave the same data set. So, that was very interesting for us.

    Host: That’s really interesting. Do you have any insights into why that might have happened or, how this community of people that you engaged with were able to outperform professional services?

    Vivek: With respect to the performance of the transcription firm itself, we can only guess, because it was a black box to us: we gave them the data set and asked them for the results, and the results we got were not that good. But we can definitely guess why workers from rural communities did so well. First of all, the additional income that workers from rural communities get from completing these tasks is significant. So, there was actually a fear that they might not get paid if they didn't complete the tasks accurately, and from that point of view, most users paid extreme attention to accuracy. These workers also found it a lot of fun. Like I mentioned before, most of their current work is physical labor, farming for instance, and many of them are actually unemployed. So, for them, this is a fun activity that they can do together with their friends, and they also get some money for it. It was both fun and a very valuable source of supplemental income, and I think both of these were significant factors in the rural workers performing really well at the task we gave them.

    Host: Your CHI paper was based on text digitization by members of rural communities. But have you looked at other types of tasks that can be completed through Karya?

    Vivek: Yes. Actually, as we were working on the platform, we realized that there is a real need for speech data sets in various languages in India. In fact, in our very lab, Kalika Bali, who is a researcher, is working on a project called Ellora, whose goal is to create voice technologies for all the languages of India. One of the fundamental bottlenecks in achieving this is labeled speech data sets- data sets that contain various audio recordings and the transcripts that correspond to those recordings. We found a mechanism to use Karya to collect such data sets for various languages. In fact, we have an ongoing study where we are collecting hundreds of hours of speech data for languages for which there is almost no data today.

    Host: So, when you give out these speech collection tasks, what is the actual process, how does it actually work?

    Vivek: So, at the lowest level, the task is essentially for users to record themselves reading out a sentence. However, to make the task more fun, we actually have them read out stories- some empowering stories, some about the history of our country, some about popular figures like the Buddha- and users really liked reading out stories as opposed to random, unconnected sentences.

    Host: So, we've been talking about Karya as a project in which we are helping or building a new paradigm in crowdsourcing. What are the actual components that go into Karya as a system?

    Vivek: So, Sridhar, if you look at any crowdsourcing platform out there today, there are two major components. One is the server, which contains all the tasks that have to be completed; that is the component work providers interact with to submit the tasks they want completed. The second component is the client, which workers use to complete the tasks. In a typical crowdsourcing platform, where an internet connection is assumed, the client directly talks to the server to get tasks, and responses are also submitted directly to the server.

    Host: Right.

    Vivek: Now, like I mentioned, most rural communities in India do not have internet connectivity. In fact, two of the three locations that we have worked with have absolutely no connectivity. Which means a platform that assumes internet connectivity is going to exclude those people from participating in the platform and getting paid for completing valuable tasks.

    Host: So, how do you bridge that?

    Vivek: So, the way we bridge this gap is by introducing this third component that we are calling a Karya Box. Now, the Karya Box is essentially a device that we will place in the village where we want to work with people. And you can think of the box as a local crowdsourcing server for that particular village.

    Host: Okay.

    Vivek: So, the Karya Box will essentially act as a local crowdsourcing server in the village where we have placed it. Users in the village can directly interact with the box through the Wi-Fi access point that the box will expose. So, anyone with a smartphone can just connect to the Karya Box Wi-Fi and then interact with the box to get tasks and submit their responses as well. Now the question is, how does the box communicate with our server?

    Host: Yeah.

    Vivek: So, in most villages which do not have connectivity, what we observe is that there are definitely people who go to nearby cities for work, or even to get digital content that they can bring back to the village. What we need to do is to employ someone like that who can carry the box to a location where there is internet connectivity, periodically, maybe once a day or even once a week. And at that instant, when the box gets connectivity to the server, it can exchange data both ways: upload the responses that have already been submitted by the rural workers, and download any new tasks for the village, if any.
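    In effect, the box performs a store-and-forward exchange whenever it briefly reaches the server. Here is a minimal sketch of that sync step, assuming hypothetical storage and transport helper objects; the real Karya protocol is not detailed in this conversation.

    ```python
    # A minimal sketch of the Karya Box's store-and-forward sync, assuming
    # hypothetical box_db and server helper objects; the real protocol may differ.
    def sync_with_server(box_db, server) -> None:
        """Run once whenever the box is carried somewhere with connectivity."""
        # 1. Upload every response workers submitted since the last sync.
        pending = box_db.get_unsynced_responses()
        server.upload_responses(pending)
        box_db.mark_synced(pending)

        # 2. Download any new tasks assigned to this village, if any.
        new_tasks = server.fetch_tasks(village_id=box_db.village_id)
        box_db.store_tasks(new_tasks)
    ```

    Because the sync is a single idempotent exchange, the design tolerates arbitrarily long gaps between syncs, which is exactly what made it robust to the internet shutdown described below.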

    Host: That seems to be a smart and inexpensive way to get around the lack of connectivity issue.

    Vivek: Absolutely. Actually, I can tell you a story around this.

    Host: Oh, please, we love stories.

    Vivek: When we did our recent study, we deployed the box in a village that actually has really good connectivity. So, we were expecting the box to be in regular contact with our server. But due to various reasons, there was an internet shutdown in the village for the first week after we deployed the box. But there you go. Our system worked anyway, because it does not assume that the box regularly talks to the server.

    Host: I am assuming, and correct me if I'm wrong, that a lot of the people who are interacting with the system, the Karya app especially on the phone, would be doing something of this nature for the first time. How did people typically find working with the Karya app, and were there significant hurdles or issues in the communities where you ran your experiments?

    Vivek: So, like I mentioned before, most people actually found doing this kind of an activity a lot of fun. So, from that point of view, there was not much boredom even though the tasks were extremely repetitive. Now imagine looking at words screen after screen and typing them out, or sentences screen after screen and reading them out. This is probably a very mundane task for people in urban communities. But people in rural communities, who don't get to do this kind of thing very often, or even interact with a smartphone very often, actually found it a lot of fun. In fact, many people took a real sense of pride in completing tasks in their local language.

    (Music plays)

    Host: It seems like this kind of digital work has the potential, over time, to provide people with livelihoods and enhance existing incomes. Do you think there is a potential downside to digital work or to the online gig economy?

    Vivek: Definitely, there is a limitation, similar to any other gig economy, be it cab-hailing services or food delivery, where the work is physical gig work. As more people join the platform, the amount of work that is going to be available for every individual person is going to go down. So in that sense, one should not think of even the digital gig economy as a sustainable source of livelihood. From that point of view, one of the risks is the excitement that workers in rural communities have for such tasks. These tasks are much easier to complete than the work they are involved in right now, and they also pay much more. So there is definitely the possibility that some of them may think this is a much more lucrative job that will provide a full-time income for them. But we have to warn them in advance that this is not the case.

    Host: So expectation setting is going to be key.

    Vivek: Expectation setting is, in fact, a huge part of what we need to do when we actually scale out the platform. In fact, even in the small studies that we conducted in these villages, which ran for a period of two weeks during which people might earn, let's say, 3000 rupees, their question at the end of the study was, "When are you going to come back?" Right? So, that sort of enthusiasm is both encouraging and scary. Because if you don't have a sustainable source of work that you can provide to these villagers, it can end up in disappointment.

    Host: Was there anything that surprised you when you were working with and when you were talking to various communities during the experiments with Karya?

    Vivek: Yes. Actually, two things stood out for us. The first thing is how inclusive the notion of digital work can be when it comes to employing people from diverse backgrounds. What we observed was that women who were typically not allowed to leave their houses in rural communities, for various reasons, were able to participate on our platform and earn an income for the first time in their lives. People with physical disabilities were able to participate on our platform as well.

    Host: That must have felt extremely empowering for them.

    Vivek: Absolutely. And the second thing that we observed is, like I mentioned before, this sense of pride that they had when they were completing tasks in their local language. This is not something that they get to do often. In fact, in one of our studies, where the task involved recording themselves reading out stories, many people went back and did the tasks all over again just so that they could read the stories to their kids or to the community. This was completely surprising to us. Imagine whether someone in an urban community would be willing to do that.

    Host: Yeah. That's good food for thought.

    So it certainly seems like your experiments with Karya show that it's got a huge amount of promise and potential. Over time, where do you see, or where do you hope to see, Karya?

    Vivek: So Sridhar, like I mentioned, as language technologies keep improving, the need for creating these technologies for various Indian languages is only going to increase. There are going to be many startups that want data sets for creating the models they need in local languages. We believe, with the insights and solutions that we have built for creating a crowdsourcing platform for rural communities, Karya can be the platform that these organizations, both private startups and even the government, can come to, to get their valuable tasks completed.

    Host: Vivek, this has been an extremely interesting conversation. Thank you for your time.

    Vivek: Thanks a lot, Sridhar for giving me this opportunity to both talk about the project and also do my first podcast.

    Host: My pleasure.

    To learn more about Dr. Vivek Seshadri and the Technology for Emerging Markets Group, visit the Microsoft Research India website.

  • Podcast: Potential and Pitfalls of AI with Dr. Eric Horvitz · 2 March 2020 · Microsoft Research India Podcast

    Episode 001 | March 06, 2020

    Dr. Eric Horvitz is a technical fellow at Microsoft, and is director of Microsoft Research Labs, including research centers in Redmond, Washington; Cambridge, Massachusetts; New York, New York; Montreal, Canada; Cambridge, UK; and Bengaluru, India. He is one of the world’s leaders in AI, and a thought leader in the use of AI in the complexity of the real world.

    On this podcast, we talk to Dr. Horvitz about a wide range of topics, including his thought leadership in AI, his study of AI and its influence on society, the potential and pitfalls of AI, and how useful AI can be in a country like India.

    Transcript

    Eric Horvitz: Humans will always want to make connections with humans. Sociologists, social workers, physicians, teachers: we’re always going to want to make human connections and have human contact.

    I think they’ll be amplified in a world of richer automation, so much so that even when machines can generate art and write music, even music with lyrics that might put a tear in someone’s eye if they didn’t know it was a machine, that will lead us to say, “Is that written by a human? I want to hear a song sung by a human who experienced something the way I would experience something, not a machine.” And so I think human touch, human experience, and human connection will grow even more important in a world of rising automation, and those kinds of tasks and abilities will be even more compensated than they are today.

    (music plays)

    Host: Welcome to the Microsoft Research India podcast, where we explore cutting-edge research that’s impacting technology and society. I’m your host, Sridhar Vedantham.

    Host: Our guest today is Dr. Eric Horvitz, Technical Fellow and director of the Microsoft Research Labs. It’s tremendously exciting to have him as the first guest on the MSR India podcast because of his stature as a leader in research and his deep understanding of the technical and societal impact of AI.

    Among the many honors and recognitions Eric has received over the course of his career are the Feigenbaum Prize and the Allen Newell Prize for contributions to AI, and the CHI Academy honor for his work at the intersection of AI and human-computer interaction. He has been elected fellow of the National Academy of Engineering (NAE), the Association for Computing Machinery (ACM) and the Association for the Advancement of Artificial Intelligence (AAAI), where he also served as president. Eric is also a fellow of the American Association for the Advancement of Science (AAAS), the American Academy of Arts and Sciences, and the American Philosophical Society. He has served on advisory committees for the National Science Foundation, National Institutes of Health, President’s Council of Advisors on Science and Technology, DARPA, and the Allen Institute for AI.

    Eric has been deeply involved in studying the influences of AI on people and society, including issues around ethics, law, and safety. He chairs Microsoft’s Aether committee on AI, Ethics, and Effects in Engineering and Research. He established the One Hundred Year Study on AI at Stanford University and co-founded the Partnership on AI. Eric received his PhD and MD degrees at Stanford University.

    On this podcast, we talk to Eric about his journey in Microsoft Research, his own research, the potential and pitfalls he sees in AI, how AI can help in countries like India, and much more.

    Host: Eric, welcome to the podcast.

    Eric Horvitz: It’s an honor to be here. I just heard I am the first interviewee for this new series.

    Host: Yes, you are, and we are really excited about that. I can’t think of anyone better to do the first podcast of the series with! There’s something I’ve been curious about for a long time. Researchers at Microsoft Research come with extremely impressive academic credentials. It’s always intrigued me that you have a medical degree and also a degree in computer science. What was the thinking behind this and how does one complement the other in the work that you do?

    Eric Horvitz: One of the deeply shared attributes of folks at Microsoft Research, and so many of our colleagues doing research in computer science, is deep curiosity, and I’ve always been one of these folks that’s said “why” to everything. I’m sure my parents were frustrated with my sequence of whys, starting with one question going to another. So I’ve been very curious. As an undergraduate I did deep dives into physics and chemistry, of course math to support it all, and biology, and by the time I was getting ready to go to grad school I really was exploring so many sciences. But the big “why” for me that I could not figure out was the why of human minds, the why of cognition. I just had no intuition as to how the cells, these tangles of cells that we learn about in biology and neuroscience, could have anything to do with my second-to-second experience of being a human being. And so, you know what, I had to just spend my graduate years diving into the unknowns about this from the scientific side of things. Of course, many people have provided answers over the centuries; some of the answers are the foundations of religious beliefs of various kinds and religious systems.

    So I decided to go get an MD-PhD: why not understand humans deeply, human minds as well as the scientific side of nervous systems? But I was still on an arc of learning as I hit grad school at Stanford, and it was great to be at Stanford because the medical school was right next to the computer science department. You can literally walk over, and I found myself sitting in computer science classes, philosophy classes, philosophy-of-mind-oriented classes and cognitive psychology classes. That was to the side of that kind of grad school life in the MD-PhD program, the anatomy classes, being socialized into the medical school class. But I was delighted by the pursuit of, you might call it, the philosophical and computational side of mind, and eventually I made the jump, the leap. I said, “You know what, my pursuit is principles. I think that’s the best hope for building insights about what’s going on.” And I turned those principles toward real-world problems; in particular, since I had a foot in the medical school, how do we apply these systems in time-critical settings to help emergency room physicians and trauma surgeons? Time-critical action, where computer systems had to act quickly but also had to act precisely when they maybe didn’t have enough time to think all the way through. And this led me to what I think is an interesting direction, which is models of bounded rationality, which I think describe us all.

    Host: Let’s jump into a topic that seems to be on everybody’s mind today – AI. Everyone seems to have a different idea about what AI actually is and what it means to them. I also constantly keep coming across people who use AI and the term ML or machine learning as synonyms. What does AI mean to you and do you think there’s a difference between AI and ML?

    Eric Horvitz: The scientists and engineers who first used the phrase artificial intelligence did so in a beautiful document that’s so well written, in terms of the questions it asks, that it could be a proposal today to the National Science Foundation, and it would seem modern given that so many of the problems have not been solved. But they laid out the vision, including the pillars of artificial intelligence.

    This notion of perception: building systems that could recognize, perceive or sense the world. This idea of reasoning: using logic or other methods to reason about problems and solve them. Learning: how systems can become better at what they do with experience and other kinds of sources of information. And this final notion they focused on as being very much in the realm of human intelligence: language, understanding how to manipulate symbols in streams or sequences to express concepts and use language.

    So, learning has always been an important part of artificial intelligence; it’s one of several pillars of work. It’s grown in importance of late, so much so that people often write AI/ML to refer to machine learning, but it’s one piece, and it has always been an important piece, of artificial intelligence.

    Host: I think that clarifies the difference between AI and ML. Today, we see AI all around us. What about AI really excites you and what do you think the potential pitfalls of AI could be?

    Eric Horvitz: So let me first say that AI is a constellation of technologies. It’s not a single technology. Although, these days there’s quite a bit of focus on the ability to learn how to predict or move or solve problems via machine learning, analyzing the large amounts of data which have become available over the last several decades, when data used to be scarce.

    I’m most excited about my initial goals to understand human minds. So, whenever I read a paper on AI, or see a talk, or see a new theorem being proved, my first reaction is: how does it grow my understanding, how does it help to answer the questions that have been long-standing in my mind about the foundations of human cognition? I don’t often say that to anybody, but that’s what I’m thinking.

    Secondly, my sense is, what a great endeavor, to be pushing your whole life to better understand and comprehend human minds. It’s been a slow slog; however, insights have come with advances, and about how they relate to those questions. But along the way, what a fabulous opportunity to apply the latest advances to enhancing the lives of people, to empowering people in new ways, and to create new kinds of automation that can lead to new kinds of value, new kinds of experiences for people. The whole notion of augmenting human intellect with machines has been something that’s fascinated me for many decades. So I love the fact that we can now leverage these technologies and apply them, even though we’re still very early on in how these ideas relate to what’s going on in our minds.

    Applications include healthcare. There’s so much to do in healthcare with decreasing the cost of medicine while raising the quality of care. This idea of being able to take large amounts of data to build high-quality, high-precision diagnostic systems. Systems that can predict outcomes. We recently created a system, for example, that can detect when a patient in a hospital is going to crash unexpectedly with organ system failures, and that can be used in ways that alert physicians and medical teams in advance to be ready to save patients’ lives.

    Even applications that we’re now seeing in daily life, like cars that drive themselves. I drive a Tesla and I’ve been enjoying the experience of the semi-automated driving the system can do. Just seeing how far we’ve gotten in a few years with systems that recognize patterns, like the patterns on a road, or that recognize objects in their way for automatic braking. These systems can save thousands of lives. I’m not sure about India, but I know the United States statistics, and there are a little bit more than 40,000 lives lost on the highways in the United States per year. Looking at the traffic outside here in Bangalore, I’m guessing that India is at least up there, with tens of thousands of deaths per year. I believe that AI systems can reduce these numbers of deaths by helping people to drive better, even if it’s just in safety-related features.

    Host: The number of fatalities on Indian roads is indeed huge and that’s in fact been one of the motivators for a different research project in the lab on which I hope to do a podcast in the near future.

    Eric Horvitz: I know it’s the HAMS project.

    Host: It is the HAMS project and I’m hoping that we can do a podcast with the researchers on that sometime soon. Now, going back to AI, what do you think we need to look out for or be wary of? People, including industry leaders, seem to land on various points on a very broad spectrum, ranging from “AI is great for humanity” to “AI is going to overpower and subsume the human race at some point of time.”

    Eric Horvitz: So, what’s interesting to me is that over the last three decades we’ve gone from “AI stands for almost implemented; it doesn’t really work very well, have fun, good luck,” to this idea of just getting things up and running and being so excited that there’s no other concern but to get this thing out the door and have it, for example, help physicians diagnose patients more accurately, to now: “Wait a minute! We are putting these machines in places that historically have always relied upon human intelligence. As these machines for the first time edge into the realm of human intellect, what are the ethical issues coming to the fore? Are there intrinsic biases in the way data is created or collected, some of which might come from the society’s biases that create the data? What about the safety issues and the harms that can come from these systems when they make a mistake? When will systems be used in ways that could deny people consequential services like a loan or education, because of an unfair decision or a decision that aligns mysteriously or obviously with the way society has worked, amplifying deep biases that have come through our history?”

    These are all concerns that many of us are bringing to light and asking for more resources and attention to focus on and also trying to cool the jets of some enthusiasts who want to just blast ahead and apply these technologies without thinking deeply about the implications, I’d say sometimes the rough edges of these technologies. Now, I’m very optimistic that we will find pathways to getting incredible amounts of value out of these systems when properly applied, but we need to watch out for all sorts of possible adverse effects when we take our AI and throw it into the complexity of the open world outside of our clean laboratories.

    Host: You’ve teed-up my next question perfectly. Is it incumbent upon large tech companies who are leading the charge as far as AI is concerned to be responsible for what AI is doing, and the ethics and the fairness and all the stuff behind AI which makes it kind of equitable to people at large?

    Eric Horvitz: It’s a good question. There are different points of view on that question. We’ve heard some company leaders issue policy statements along the lines of: “We will produce technologies and make them available, and it’s the laws of the country that will help guide how they’re used or regulate what we do. If there are no laws, there’s no reason why we shouldn’t be selling something, with a focus on profit and our zeal for technology.”

    Microsoft’s point of view has been that the technology created by the experts inside its laboratories and by its engineers sometimes gets ahead of where legislation and regulation need to be, and therefore we bear a responsibility as a company, both in informing regulatory agencies and the public at large about the potential downsides of technology and appropriate uses and misuses, and in looking carefully at what we do when we actually ship our products, make a cloud service available, or build something for a customer.

    Host: Eric, I know that you personally are deeply involved in thinking through AI and its impact on society, how to make it fair, how to make it transparent and so on. Could you talk a little bit about that, especially in the context of what Microsoft is doing to ensure that AI is actually good for everybody?

    Eric Horvitz: You know, this is why this is such a passion for me. I’ve been extremely interested, starting with the technical issues, which I thought, and still think, are really deep and fascinating: when you build a limited system that by definition is much simpler than the complex universe it’s going to be immersed in, you take it from the laboratory into the open world. I refer to that as AI in the open world. You learn a lot about the limitations of the AI. You also learn to ask questions and to extend these systems so they’re humble, they understand their limitations, they understand how accurate they are; you give them a level of self-knowledge. This is a whole area of open-world intelligence that I think really reads upon some of the early questions for me about what humans are doing, what their minds are doing, and potentially other animals, vertebrates.

    It started there for me. Back to your question now, we are facing the same kind of things when we take an AI technology and put it in the hands of a judge who might make decisions about criminal justice looking at recommendations based on statistics to help him or her take an action. Now we have to realize we have systems we’re building that work with people. People want explanations. They don’t want to look at a black box with an indicator on it. They will say, why is this system telling me this?

    So at Microsoft we’ve made significant investments, in our research team, in our engineering teams, and in our policy groups, in thinking through the details of the problems and solutions, and I’ll just list a few right now. Safety and robustness of AI systems. Transparency and intelligibility of these systems: can they explain themselves? Bias and fairness: how can we build systems that are fair along certain dimensions? Engineering best practices: what does it mean for a team working with tools to understand how to build a system and maintain it over time so that it’s trustworthy? Human-AI collaboration: what are principles by which we can enable people to work in a fluid way with systems that might be trying to augment their intelligence, such that there is a back and forth and an understanding of when a system is not confident, for example? Even notions about attention and cognition: are these systems being used in ways that might be favorable to advertisers, grabbing your attention and holding it on an application because they’ve learned how to do that mysteriously? Should we have a point of view about that?

    So Microsoft Research has stood up teams looking at these questions. We also have stood up an ethics advisory board that we call the Aether Committee to deliberate and provide advice on hard questions that are coming up across the spectrum of these issues and providing guidance to our senior leadership team at Microsoft in how we do our business.

    Host: I know you were the co-founder of the Partnership on AI. Can you talk a little bit about that and what it sought to achieve?

    Eric Horvitz: This vision arose literally at conferences, and, in fact, one of the key meetings was at a pub in New York City after a meeting at NYU, where several computer scientists got together, all passionate about seeing it go well for artificial intelligence technologies by investing in understanding and addressing some of these rough edges. We decided we could bring together the large IT companies, Amazon, Apple, Facebook, Google, Microsoft, to think together about what it might mean to build an organization, a nonprofit, that balanced the IT companies with groups in civil society, academic groups, and nonprofit AI research, to think through these challenges and come up with best practices in a way that brought the companies together rather than separating them through a competitive spirit. Actually, this organization was created by the force of the friendships of AI scientists, many of whom go back to being in grad school together across many universities, this invisible college of people united in an interest in understanding how to do AI in the open world.

    Host: Do you think there is a role for governments to play where policies governing AI are concerned, or do you think it’s best left to technology companies, individual thinkers and leaders to figure out what to do with AI?

    Eric Horvitz: Well, AI is evolving quickly, and like other technologies, governments have a significant role to play in assuring the safety of these technologies, their fairness, and their appropriate uses. I see regulatory activity being, of course, largely in the hands of governments, advised by leadership in academia and in industry, and by the public, which has a lot to say about these technologies.

    There’s been quite a bit of interest and activity, some of that is part of the enthusiastic energy, you might say, going into thinking through AI right now. Some people say there’s a hype-cycle that’s leaking everywhere and to all regimes, including governments right now, but it’s great to see various agencies writing documents, asking for advice, looking for sets of principles, publishing principles and engaging multi-stakeholder groups across the world.

    Host: There’s been a lot of talk and many conversations about the impact that AI can have on the common man. One of the areas of concern with AI spreading is the loss of jobs at a large scale. What’s your opinion on how AI is going to impact jobs?

    Eric Horvitz: My sense is there’s a lot of uncertainty about this: what kinds of jobs will be created, what kinds of jobs will go away. If you take a segment like driving cars, I was surprised at how large a percentage of the US population makes their living driving trucks. Now, what if the long-haul parts of truck driving, the long highway stretches, go away when they become automated? It’s unclear what the ripples of that effect will be on society, on the economy. It’s interesting, there are various studies underway. I was involved in an international academy study looking at the potential effects of new kinds of automation coming via computer science and other related technologies, and the result of that analysis was that we’re flying in the dark. We don’t have enough data to make these decisions yet, or to make these recommendations, or to have understandings about how things are going to go. So, we see people saying things on all sides right now.

    My own sense is that there’ll be some significant influences of AI on our daily lives and how we make our livings. But I’ll say one thing. One of my expectations, and it’s maybe also a hope, is that as we see more automation in the world, and as that shifts the nature of what we do daily and what we’re paid or compensated to do, what we call work, there’ll be certain aspects of human discourse that we simply will learn, for a variety of reasons, that we cannot automate, we aren’t able to automate, or we shouldn’t automate. The way I refer to this is: in the midst of the rise of new kinds of automation, some of which reads on tasks and abilities we would in the past have assumed were the realm of human intellect, we’ll see a concurrent rise of an economy built around human caring. Think about this: humans will always want to make connections with humans. Sociologists, social workers, physicians, teachers: we’re always going to want to make human connections and have human contacts.

    I think they’ll be amplified in a world of richer automation, so much so that even when machines can generate art and write music, even music with lyrics that might put a tear in someone’s eye if they didn’t know it was a machine, that will lead us to say, “Is that written by a human? I want to hear a song sung by a human who experienced something the way I would experience something, not a machine.” And so I think human touch, human experience, and human connection will grow even more important in a world of rising automation, and those kinds of tasks and abilities will be even more compensated than they are today. So, we’ll see even more jobs in this realm of human caring.

    Host: Now, switching gears a bit, you’ve been in Microsoft Research for a long time. How have you seen MSR evolve over time and as a leader of the organization, what’s your vision for MSR over the next few years?

    Eric Horvitz: It’s been such an interesting journey. When I came to Microsoft Research it was 1992, and Rick Rashid and Nathan Myhrvold convinced me, along with two colleagues, to stay. We had just come out of Stanford grad school and had ideas about going into academia. We came up to Microsoft to visit; we thought we were just here for a day to check things out. There were maybe seven or eight people in what was then called Microsoft Research, and, oh come on, please, we didn’t really see a big future. But somehow we took a risk, and we loved this mission statement that starts with “Expand the state-of-the-art.” Period.

    Second part of the mission statement: “Transfer those technologies as fast as possible into real products and services.” Third part of the statement: “Contribute to the vibrancy of this organization.” I remember seeing in my mind, as we committed to doing this, trying it out, a vision of a lever with the fulcrum at the mountaintop on the horizon. And I thought, how can we make this company ours, our platform to take our ideas, which then were bubbling (we had so many ideas about what we could do with AI from my graduate work), and move the world? And that’s always been my sense of what Microsoft Research is about. It’s a place where the top intellectual talent in the world, top scholars, often with entrepreneurial bents, who want to get something done, can make Microsoft their platform for expressing their creativity and having real influence in enhancing the lives of millions of people.

    Host: Something I’ve heard for many years at Microsoft Research is that finding the right answer is not the biggest thing, what’s important is to ask the right, tough questions. And also that if you succeed in everything you do you are probably not taking enough risks. Does MSR continue to follow these philosophies?

    Eric Horvitz: Well, I’ll say three things about that. First of all, why should a large company have an organization like Microsoft Research? It’s unique. We don’t see that even in competitors. Most competitors are taking experts, if they can attract them, and embedding them in product teams. Microsoft has had the foresight, and we’re reaching 30 years now since we kicked off Microsoft Research, to say: if we take top talent, attract this top talent into the company, give these people time, and familiarize them with many of our problems and aspirations, they can not only come up with new ideas and out-of-the-box directions, they can also provide new kinds of leadership to the company as a whole, setting its direction, providing a weathervane, looking out to the late-breaking changes on the frontiers of computer science and other sciences, and helping to shape Microsoft in the world, versus, for example, helping a specific product team do better with an existing conception of what a product should be.

    Host: Do you see this role of Microsoft Research changing over the next few years?

    Eric Horvitz: Microsoft has changed over its history, and one of my interests and reflections, which I shared in an all-hands meeting just last night with MSR India, where we tried out some new ideas coming out of a retreat that the Microsoft Research leadership team had in December, just a few months ago, is how we might continue to think and reflect about being the best we can be, given who we are. I’ve called it polishing the gem: not breaking it but polishing it, buffing it out, thinking about what we can do with it to make ourselves even more effective in the world.

    One trend we’ve seen at Microsoft is that over the years we’ve gone from Microsoft Research, this separate tower of intellectual depth reaching out into the company in a variety of ways, forming teams, advising, working with outside agencies, with students in the world, with universities, to a larger ecosystem of research at Microsoft, where we have pockets of advanced technology groups around the company doing great work, and in some ways doing the kinds of things that Microsoft Research used to be doing, or was solely doing, at Microsoft.

    So we see that as upping the game as to what a center of excellence should be doing. I’m just asking the question right now: what are our deep strengths, this notion of deep scholarship, deep ability? How can we best leverage that for the world and for the company, and how can we work with other teams in the larger R&D ecosystem which has come to be at Microsoft?

    Host: You’ve been at the India Lab for a couple of days now. How has the trip been and what do you think of the work that the lab in India is doing?

    Eric Horvitz: You know, we just hit 15 here. Fifteen years old, so this lab is just getting out of adolescence; that’s a teenager. It seems like just yesterday when I was sitting with Anandan, the first director of this lab, looking at a one-pager that he had written about standing up a lab in India. I was sitting in Redmond having coffee, and I tell you, that was a fast 15 years. But it’s been great to see what this lab became and what it does. Each of our labs is unique in so many ways, typically based on the culture it’s immersed in.

    The India lab is famous for its deep theoretical chops, with fabulous theorists here, the best in the world, and this interdisciplinary spirit of taking theory and melding it with real-world challenges to create incredible new kinds of services and software. One of the marquee areas of this lab has been this notion of taking a hard look, an insightful gaze, at emerging markets and Indian culture overall, and thinking about how computing, computing platforms, and communications can be harnessed in a variety of ways to enhance the lives of people: how can they be better educated, how can we make farms and agriculture more efficient and productive, how can we think about new economic models and new kinds of jobs, how can we leverage new notions of what it means to do freelance or gig work. So the lab has its own feel, its own texture, and when I immerse myself in it for a few days, I just love getting familiar with the latest new hires, the new research fellows, the young folks coming out of undergrad who are bright-eyed and inject energy into this place.

    So I find Microsoft Research India to have a unique combination of talented researchers and engineers that brings to the table some of the deepest theory in the world, theoretical understandings of hard computer science, including challenges with understanding the foundations of AI systems. There’s a lot of work going on right now in machine learning, as we discussed earlier, but we don’t have a deep understanding, for example, of how these neural network systems work and why they’re working so well. I just came out of a meeting where folks in this lab have come up with some of the first insights into why some of these procedures are working so well. To understand that, and to understand their limitations, which ways to go, and how to guide and navigate these problems, is rare, and it takes a deep focus and ability to understand the complexity arising in these representations and methods.

    At the same time, we have the same kind of focus and intensity with a gaze at culture, at emerging markets. There are some grand challenges in understanding the role of technology in society when it comes to a complex civilization, or I should say a set of civilizations, like we see in India today. This mix of futuristic, out-of-the-box advanced technology with rural farms and classical ways of doing things, meshing the old and the new, and so many differences as you move from province to province, state to state. And the sociologists and practitioners here who are looking carefully at ethnography, epidemiology, and sociology, coupled with computer science, are doing fabulous things at the Microsoft Research India lab. Even coming up with new thinking about how we can mesh opportunistic Wi-Fi with sneakers, Sneakernet, people walking around to share large amounts of data. I don’t think that project would have arisen anywhere but at this lab.

    Host: Right. So you’ve again teed-up my next question perfectly. As you said India’s a very complex place in terms of societal inequities and wealth inequalities.

    Eric Horvitz: And technical inequality, it’s amazing how different things are from place to place.

    Host: That’s right. So, what do you think India can do to utilize AI better and do you think India is a place that can generate new innovative kinds of AI?

    Eric Horvitz: Well, absolutely, the latter is going to be true, because some of the best talent in computer science in the world is being educated and is working in this country, so of course we will see fabulous things, fabulous innovations, originating in India, both in the universities and in research labs, including Microsoft Research. As to how to harness these technologies, you know, it takes a special skill to look at the currently available capabilities in a constellation of technologies and to think deeply about how to take them into the open world, into the real world, the complex, messy world.

    It often takes insights, as well as a very caring team of people, to stick with an idea, to try things out, to watch it and nurture it, and to involve multiple stakeholders in watching over time, for example, how a deployment works, gathering data about it and so on. So, I think some very promising areas include healthcare. There are some sets of illnesses that are low-hanging fruit for early detection and diagnosis: understanding where we could intervene early on, by looking at pre-diabetes states for example, and guiding patients early on to getting care so they do not progress into more serious pathophysiologies; understanding when someone needs to be hospitalized and how long they should be hospitalized. In a resource-limited realm, we have to selectively allocate resources, and doing that more optimally can lead to great effects.

    This idea of understanding education: how to educate people, how to engage them over time, diagnosing which students might drop out early on and alerting teachers to invest more effort, understanding when students don’t understand something and automatically helping them get through a hard concept. We’re seeing interesting breakthroughs now in tutoring systems that can detect these states. Transportation: I mean, it’s funny, we built systems in the United States, and this is what I was doing, to predict traffic and to route cars ideally. Then we come to India and we look at the streets here and say, “I don’t think so, we need a different approach,” but it just raises the stakes on how we can apply AI in new ways. So, the big pillars are education, healthcare, transportation, even understanding how to guide resources and allocations in the economy. I think we’ll see big effects of insightful applications in this country.

    Host: This has been a very interesting conversation. Before we finish do you want to leave us with some final thoughts?

    Eric Horvitz: Maybe I’ll make a call out to young folks who are thinking about their careers and what they might want to do, and assure them that it’s worth it. It’s worth investing in taking your classes seriously, in asking lots of questions, in having your curiosities addressed by your teachers, your colleagues, and your family. There’s so much excitement and fun in doing research and development, in being able to build things and feel them and see how they work in the world, and maybe mostly in being able to take ideas into reality in ways that you can see the output of your efforts and ideas really delivering value to people in the world.

    Host: That was a great conversation, Eric. Thank you!

    Eric Horvitz: Thank you, it’s been fun.
