Generative AI Series 1, Silicon Valley: Fellow Discussions on GAI Trends and Opportunities
Vijay Narayanan, General Partner at Fellows Fund
In January, Fellows Fund brought together a diverse group of technology leaders from Google, OpenAI, Zoom, Instacart, Apple, Roblox, LinkedIn, Stanford, and UC Berkeley, along with entrepreneurs in domains including AI research, gaming, legal, enterprise communications, digital marketing, healthcare, education, and consumer tech, to share thoughts on the opportunities and applications that will be ignited by rapidly evolving generative AI (GAI) technologies. It was an exciting panel that covered a wide-ranging set of topics across GAI infrastructure, frameworks, experiences, and applications.
The key takeaways from the discussion are:
By reducing the barriers to generating quality content in and across natural and software languages, and across modalities like images, video, and audio, GAI enhances how humans communicate with each other and with machines. With GAI, we can translate ideas into text, software, and visual and audio artifacts, and iterate on them much faster than before, enabling us to create high-quality content at scale much more quickly.
Research and discovery in a wide variety of industries will be greatly accelerated by GAI. For example, copywriting, design, gaming, drug discovery and life sciences, materials science, and many other fields will use GAI to find promising modifications to existing products or to discover new products with high utility value.
We believe that over the next 3 years, while there will be a few large general-purpose GAI models, in both open- and closed-source form, that can generate content in any domain, much of the value from GAI technology will be captured by the infrastructure companies hosting these large models and by application companies curating domain-specific datasets and implementing targeted data acquisition strategies with iterative closed-loop experiments.
The panel shared recent evolutionary trends in GAI technology and its impact on different domains, with each panelist identifying one or more applications in their domain that will be transformed by GAI. These applications fell broadly into 3 main categories:
Improving the quality of human interactions through language
Spoken language and written text are dominant modes of human communication in both personal and professional settings.
AI-enabled writing: A major use of GAI is to increase the productivity of writers by making it easier to generate quality content with the right syntax and in different narrative styles and formats. In recent years, a number of startups using AI to generate content (AIGC) have been founded to address a variety of writing use cases, such as emails for formal and informal communication, documentation for specifications and functional descriptions, blogs for personal commentary or frequently updated business content, marketing content and ads, summaries of documents or conversations, etc. A few of these include Jasper - an AI content generator for teams, Writesonic - an AI writer for blogs, ads, emails, etc., Durable - an AI website builder, Omneky - personalized ad creation, and Abridge - AI for medical conversations.
Language translation and transcription: Along with content creation, GAI enables high-quality translation of speech from one language to another, where content is created in the target language while retaining the meaning of the text in the source language, reducing the barrier to human interaction across languages (e.g. Microsoft Translator - a comprehensive translation platform for personal, business, and educational use, and Viva Translate - a real-time language translator).
Human interaction with machines using natural language: Computers run programs written in high-level programming languages with specific grammar and syntax. While this has enabled software development as a new job category, it limits the number of people who can create new applications using computers. GAI technologies are now increasingly enabling humans to interact with computers using natural language, where a description of the code in plain text is converted into computer code (e.g. GitHub Copilot - an AI pair programmer, Repl.it - a code generator, Tabnine - an AI code-completion tool, Codacy - an automated code review tool), or a description of a data analysis is converted into SQL code to interact with databases (e.g. seek and nlsql).
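To make the natural-language-to-SQL idea concrete, here is a minimal Python sketch of the pattern: a plain-English question plus a schema description sent to a general-purpose LLM. The schema, prompt, and model choice are illustrative assumptions, and the sketch does not reflect the internals of any of the products mentioned above.

```python
# Illustrative sketch only: translating a plain-English request into SQL with a
# general-purpose LLM. Assumes the pre-1.0 openai client (ChatCompletion) and
# that openai.api_key has been set; schema and prompt are hypothetical.
import openai

SCHEMA = """
Table orders(order_id INT, customer_id INT, order_date DATE, total_usd FLOAT)
Table customers(customer_id INT, name TEXT, region TEXT)
"""

def text_to_sql(question: str) -> str:
    """Ask the model to produce a single SQL query answering the question."""
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system",
             "content": "You translate questions into SQL for this schema:\n"
                        + SCHEMA + "\nReturn only the SQL query."},
            {"role": "user", "content": question},
        ],
        temperature=0,  # keep the output deterministic for reproducibility
    )
    return response["choices"][0]["message"]["content"].strip()

if __name__ == "__main__":
    print(text_to_sql("Total sales per region in 2022, highest first"))
```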
Amplifying human creativity
A unique capability of GAI technology is to illustrate a concept in multiple modalities - text, images, videos, audio, etc. - and to translate across these modalities using capabilities like text-to-image, text-to-3D, text-to-plan, and image-to-caption. This empowers creators and designers to describe their design ideas in natural language, generate multiple candidate artifacts in modalities like images, videos, and audio, and iterate on these artifacts to converge quickly on a high-quality output (e.g. drawthings.ai to generate images from text, AssemblyAI - audio to text, Synthesia - create videos from text in different languages). This increased speed of iteration from idea to artifact is enabling applications in areas like generating new story ideas, designing and evolving games (e.g. Inworld - create and chat with AI characters, Twelve Labs - video understanding AI), designing real estate plans, and legal search (e.g. Swapp - create construction documents) at a rapid pace.
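As a concrete illustration of the text-to-image step in this iterate-on-candidates workflow, the sketch below uses an open-source Stable Diffusion model via the Hugging Face diffusers library; the model ID, prompt, and hardware assumptions are placeholders, and the products listed above may work very differently.

```python
# Illustrative text-to-image sketch using an open-source model via the Hugging
# Face diffusers library (assumes a GPU and that torch/diffusers are installed).
# The prompt and model ID are placeholder choices for illustration.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

prompt = "concept art of a cozy mountain cabin at dusk, soft lighting"
images = pipe(prompt, num_images_per_prompt=4).images  # generate 4 candidates

for i, image in enumerate(images):
    image.save(f"candidate_{i}.png")  # review, tweak the prompt, and iterate
```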
Fellows concurred that the gaming and movie industries will be among the first impacted by GAI technologies. Applications include converting text to 2D, text to 3D, and text to SQL, and using technologies like ChatGPT to generate movie scripts.
Accelerating research, discovery, and development
Research and discovery are part of every aspect of our lives, from finding an appropriate gift for a milestone birthday, to understanding our company's work-from-home policy, to more sophisticated discovery problems like building small molecules that are effective, safe, and efficient at binding to specific disease targets in therapeutics. These processes are increasingly guided by GAI technologies, especially when the space of potential outputs is large (e.g. screening more than 100 billion small molecules) and constrained (e.g. low toxicity to humans). GAI technologies are now also empowering computers to understand and summarize a piece of text and to answer questions about the content of a document corpus, making it easier and faster to identify and understand relevant content. Similarly, generative models of proteins are learning the 3D structures and sequences of proteins in a generalizable manner and creating new protein molecule candidates based on geometric and functional instructions (e.g. Chroma from Generate Biomedicines) to target specific diseases. Unlearn.AI is an AI-first company using GAI to build digital twins of patients to accelerate clinical trials in drug discovery and development, with the broader mission of using AI to eliminate trial and error in medicine.
The current GAI technologies often build upon large AI models called foundation models (known as large language models, or LLMs, when limited to text data) that are trained on a large corpus of data to encode patterns and relationships in the training data. The relationships learned by these foundation models are then used to generate outputs for new inputs. While there are a few large foundation models like GPT-3, DALL-E, and Flamingo that are trained on a general corpus, there are now increasingly more domain-specific models, like the Genomic LLM trained on genomic data, AlphaFold and ESM-2 trained on proteins, and ChemBERTa-2 trained on molecular data, that are more performant on domain-specific applications. Another rising use of generative AI is to generate synthetic datasets with realistic data distributions to augment training data for other AI models (e.g. Gretel.ai - a developer stack for synthetic data, Tonic.ai - AI-generated realistic fake data, Mostly AI - a synthetic data platform), accelerating the development of AI applications.
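As a toy illustration of the synthetic-data idea, the sketch below fits a very simple generative model (a multivariate Gaussian) to a handful of "real" records and samples look-alike rows to augment a training set; the platforms named above use far more sophisticated generative models, and the data here is invented purely for illustration.

```python
# Toy illustration of synthetic data generation: fit a simple generative model
# (a multivariate Gaussian) to a small "real" table and sample look-alike rows.
# Real synthetic-data products use much richer models; this only shows the
# augmentation workflow end to end with invented numbers.
import numpy as np

rng = np.random.default_rng(0)

# Pretend these are real (age, income) records we want more of.
real = np.array([[34, 58_000], [29, 47_500], [45, 91_000],
                 [52, 103_000], [38, 66_500]], dtype=float)

mean = real.mean(axis=0)
cov = np.cov(real, rowvar=False)

synthetic = rng.multivariate_normal(mean, cov, size=100)  # 100 synthetic rows
augmented = np.vstack([real, synthetic])  # training data for a downstream model
print(augmented.shape)  # (105, 2)
```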
The technologies underpinning generative AI are evolving rapidly. In light of this, some important and interesting questions arise about their trajectory and potential impact over the next 3 years.
Where will the majority of GAI run in 3 years - in the cloud, on edge devices (mobile, sensors, etc.), or both?
Today's GAI models have tens to hundreds of billions of parameters, with the larger models typically being more performant. Despite growing research and effort to optimize the size and performance of these models and make them run in environments with different compute capabilities (e.g. OmniML - hardware-aware AI, SliceX AI - an intelligence engine for cloud and edge), the sizes of performant models in the near future are likely to remain too large to fit within the memory limits (100s of MBs) of applications on widely used mobile devices. Hence, we believe that most GAI applications in the next 3 years will either run in the cloud, or in a hybrid mode where some experiences are enabled in the cloud and smaller modifications execute on the device.
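A rough back-of-the-envelope calculation shows why this is the case; the parameter counts and mobile memory budget below are illustrative assumptions rather than measurements of any particular model or device.

```python
# Back-of-the-envelope sizing: why today's performant GAI models do not fit in
# a mobile app's memory budget. Parameter counts and the budget are
# illustrative assumptions, not measurements of any specific product.
GIB = 1024 ** 3

def model_size_gib(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate in-memory size of the weights alone (fp16 = 2 bytes/param)."""
    return num_params * bytes_per_param / GIB

for name, params in [("1B-parameter model", 1e9),
                     ("10B-parameter model", 1e10),
                     ("175B-parameter model", 175e9)]:
    print(f"{name}: ~{model_size_gib(params):.0f} GiB of weights in fp16")

# A typical mobile app memory budget is on the order of hundreds of MiB,
# i.e. well under 1 GiB -- orders of magnitude below even the 10B model.
```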
Will there be a single foundation model for all domains, or will domain-specific models continue to be relevant?
While foundation models like GPT will continue to grow and become more performant across a wide range of domains, especially in zero- and few-shot settings, we believe that generative AI from models trained on domain-specific datasets will continue to be superior in performance at a fixed model size. A foundation model can also be fine-tuned with domain-specific data to enhance performance at a similar scale. This is encouraging for entrepreneurs building and curating corpora of domain-specific datasets and implementing targeted data acquisition strategies with iterative closed-loop experiments.
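For readers who want to see what this fine-tuning step looks like in practice, here is a minimal sketch using the Hugging Face Transformers library; the base model, dataset path, and hyperparameters are placeholder assumptions, not a recipe from the panel.

```python
# Illustrative sketch of adapting a general foundation model to a domain corpus
# with Hugging Face Transformers. Model name, dataset path, and hyperparameters
# are placeholder assumptions for illustration only.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_name = "gpt2"  # stand-in for any open foundation model
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(model_name)

# Hypothetical domain corpus: one text example per line.
dataset = load_dataset("text", data_files={"train": "domain_corpus.txt"})
tokenized = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="domain-model", num_train_epochs=1,
                           per_device_train_batch_size=4),
    train_dataset=tokenized["train"],
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()  # fine-tune on the domain-specific corpus
```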
There are no autonomous self-driving cars in the mass market yet, although the technology has been under development for almost a decade. Will GAI follow in the footsteps of self-driving cars?
We feel that GAI is significantly different from self-driving and expect GAI applications to proliferate in a number of domains in the next 3 years. Specifically:
Fully autonomous driving requires self-driving technology to be robust and reliable enough to make safe decisions when confronted with rare and novel situations on the road. The bar for GAI, on the other hand, is lower: GAI applications only need to add value to existing, lower-risk tasks that humans and machines perform today. In other words, GAI applications are valuable even when the technology is not always accurate, whereas fully autonomous self-driving is valuable only when it is reliable even in rare and unforeseen circumstances.
A number of GAI applications have already proven effective when the technology assists a human or machine, while the goal of fully autonomous self-driving is to replace a human driver.
What will be the largest impact of GAI in the next 3 years?
The panel identified a few areas:
To enable more people with ideas to articulate them more effectively and expeditiously, by making text and language communication universally accessible and comprehensible to both humans and computers, and to translate those ideas into artifacts across multiple modalities: text, images, videos, audio, etc.
To accelerate productivity by automating low-complexity tasks in tertiary jobs (e.g. teaching, nursing) and specialized quaternary jobs (e.g. research and development)
To assist in high-end professional tasks such as AI-assisted coaching of patients to improve their mental health, game design, drug design, software and app design, etc.
I appreciate the ideas and contributions from all the fellows, with special thanks to Daisy Zhao and Julia Zhu for supporting me in writing this article.
To join our fellow discussion on Generative AI, please apply to be a member of Fellowscollective.com, which is a startup fellow community built by Fellows Fund fellows.