Google Research in 2022 and Beyond

(T) As he has done every year since 2017, Jeff Dean, the tech lead of the Google Research and Health teams, shared some of the key achievements of his teams for 2022 in a blog post today.

This year’s blog post focuses on only a few themes, with much less breadth than in previous years, but expect seven other posts covering additional areas to come soon.

The section on language models is particularly important this year, as OpenAI appears to be leading the innovation race in language models over Google and DeepMind. The blog post does not reveal anything that Google has not already published on language models. The section on chain-of-thought prompting and its applications to mathematical reasoning and clinical knowledge is probably the most interesting to read (a few illustrative code sketches follow the quoted passage below):

“One of the broad key challenges in artificial intelligence is to build systems that can perform multi-step reasoning, learning to break down complex problems into smaller tasks and combining solutions to those to address the larger problem. Our recent work on Chain of Thought prompting, whereby the model is encouraged to “show its work” in solving new problems (similar to how your fourth-grade math teacher encouraged you to show the steps involved in solving a problem, rather than just writing down the answer you came up with), helps language models follow a logical chain of thought and generate more structured, organized and accurate responses. Like the fourth-grade math student that shows their work, not only does this make the problem-solving approach much more interpretable, it is also more likely that the correct answer will be found for complex problems that require multiple steps of reasoning.

One of the areas where multi-step reasoning is most clearly beneficial and measurable is in the ability of models to solve complex mathematical reasoning and scientific problems. A key research question is whether ML models can learn to solve complex problems using multi-step reasoning. By taking the general-purpose PaLM language model and fine-tuning it on a large corpus of mathematical documents and scientific research papers from arXiv, and then using Chain of Thought prompting and self-consistency decoding, the Minerva effort was able to demonstrate substantial improvements over the state-of-the-art for mathematical reasoning and scientific problems across a wide variety of scientific and mathematical benchmark suites.

Chain of Thought prompting is one way of better-expressing natural language prompts and examples to a model to improve its ability to tackle new tasks. The similar learned prompt tuning, in which a large language model is fine-tuned on a corpus of problem-domain–specific text, has shown great promise. In “Large Language Models Encode Clinical Knowledge”, we demonstrated that learned prompt tuning can adapt a general-purpose language model to the medical domain with relatively few examples and that the resulting model can achieve 67.6% accuracy on US Medical License Exam questions (MedQA), surpassing the prior ML state-of-the-art by over 17%. While still short compared to the abilities of clinicians, comprehension, recall of knowledge and medical reasoning all improve with model scale and instruction prompt tuning, suggesting the potential utility of LLMs in medicine. Continued work can help to create safe, helpful language models for clinical application.”
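
To make the chain-of-thought idea concrete, here is a minimal Python sketch of a few-shot chain-of-thought prompt. The worked exemplar, the helper names, and the answer-extraction regex are illustrative assumptions on my part, not code from Google's papers:

```python
import re

# One worked exemplar that "shows its work" before stating the answer,
# in the spirit of chain-of-thought prompting (exemplar text is illustrative).
COT_EXEMPLAR = (
    "Q: Roger has 5 tennis balls. He buys 2 more cans of 3 tennis balls each. "
    "How many tennis balls does he have now?\n"
    "A: Roger started with 5 balls. 2 cans of 3 balls each is 6 balls. "
    "5 + 6 = 11. The answer is 11.\n\n"
)

def cot_prompt(question: str) -> str:
    """Prepend the worked exemplar so the model reasons step by step."""
    return COT_EXEMPLAR + f"Q: {question}\nA:"

def extract_answer(completion: str):
    """Pull the final numeric answer out of a generated reasoning chain."""
    match = re.search(r"The answer is\s*(-?\d+(?:\.\d+)?)", completion)
    return match.group(1) if match else None

print(cot_prompt("A farm has 3 fields with 12 cows in each field. "
                 "How many cows are there in total?"))
```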
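
The self-consistency decoding used by Minerva can be sketched just as briefly: sample several reasoning chains at non-zero temperature, then take a majority vote over the extracted final answers. The `sample_chain` stub below is a hypothetical stand-in for an actual model call:

```python
import re
import random
from collections import Counter

def sample_chain(question: str) -> str:
    """Hypothetical stand-in for sampling one reasoning chain from the model
    at temperature > 0 (here it just simulates noisy reasoning paths)."""
    answer = random.choice(["11", "11", "11", "12"])
    return f"...reasoning step by step... The answer is {answer}."

def self_consistent_answer(question: str, k: int = 10) -> str:
    """Sample k chains and return the most frequent final answer."""
    answers = []
    for _ in range(k):
        completion = sample_chain(question)
        match = re.search(r"The answer is\s*(-?\d+(?:\.\d+)?)", completion)
        if match:
            answers.append(match.group(1))
    return Counter(answers).most_common(1)[0][0]

print(self_consistent_answer("How many tennis balls does Roger have?"))
```

The intuition is that incorrect chains tend to disagree with one another while correct chains converge on the same final answer, so the vote filters out many one-off reasoning errors.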
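
Finally, the learned (soft) prompt tuning behind the Med-PaLM result can be sketched as follows: the pretrained model's weights stay frozen, and only a small matrix of continuous prompt embeddings, prepended to the input, is trained. The toy model, dimensions, and training step below are assumptions for illustration, not the actual Med-PaLM setup:

```python
import torch
import torch.nn as nn

class ToyLM(nn.Module):
    """Tiny stand-in for a frozen pretrained language model."""
    def __init__(self, vocab_size: int = 100, d_model: int = 32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, input_embeds: torch.Tensor) -> torch.Tensor:
        return self.head(input_embeds)  # (batch, seq, vocab) logits

model = ToyLM()
for p in model.parameters():
    p.requires_grad = False  # the base model stays frozen

batch, seq_len, n_prompt, d_model = 4, 16, 8, 32
# The soft prompt is the only trainable set of weights.
soft_prompt = nn.Parameter(torch.randn(n_prompt, d_model) * 0.02)
optimizer = torch.optim.Adam([soft_prompt], lr=1e-3)

tokens = torch.randint(0, 100, (batch, seq_len))              # toy token ids
targets = torch.randint(0, 100, (batch, seq_len + n_prompt))  # toy targets

token_embeds = model.embed(tokens)                            # (4, 16, 32)
prompt_embeds = soft_prompt.unsqueeze(0).expand(batch, -1, -1)
inputs = torch.cat([prompt_embeds, token_embeds], dim=1)      # prompt first

logits = model(inputs)                                        # (4, 24, 100)
loss = nn.functional.cross_entropy(logits.transpose(1, 2), targets)
loss.backward()
optimizer.step()                                  # updates soft_prompt only
print(f"toy training loss: {loss.item():.3f}")
```

The appeal of this approach is that only a few thousand parameters are trained per domain, while the multi-billion-parameter base model is shared unchanged across tasks.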

Note: The picture above is a painting by Claude Monet, “Prairie avec meules de foin près de Giverny” (1885), on display at the Museum of Fine Arts in Boston.

Copyright © 2005-2023 by Serge-Paul Carrasco. All rights reserved.
Contact Us: asvinsider at gmail dot com.