Microsoft has announced it has built ‘one of the top five publicly disclosed supercomputers’ in the world in collaboration with – and exclusively for – OpenAI. This is the result of a partnership announced last year, which saw Microsoft and OpenAI come together to utilize the Azure platform for large-scale AI purposes.
The deal, which included a $1 billion investment by Microsoft, was envisaged as an ‘exclusive computing partnership’ – encompassing joint efforts on building new Azure supercomputing technologies, porting OpenAI services to the Azure platform, and Microsoft winning ‘preferred partner’ status.
OpenAI, to refresh your memory, were the ones to come up with the GPT-2 language model and synthetic text generator, which displayed such adeptness at generating copy that it creators, fearing it could worsen the fake news epidemic, refused to release the full model to the public (it was eventually released in November 2019).
Focus on large-scale AI
According to Microsoft, the new supercomputer (which consists of 285,000 CPU cores and 10,000 GPUs) will be used to create large-scale AI models, which are more powerful than smaller models and can be fine-tuned for specific tasks. As Microsoft CTO Kevin Scott explained, “This is about taking a very broad set of data and training a model that learns to do a general set of things and making that model available for millions of developers to go figure out how to do interesting and creative things with.”
Large-scale AI models are considered especially useful for natural language processing, as using a far larger set of publicly available texts allows the model to pick up the nuances that mark our use of language. According to experts, these ‘self-supervised’ models, trained on a wide range of text written by humans, might be more effective at understanding how we use language than the ‘traditional’ methods of training AI models on smaller, more specific human-labelled sets of data.
“This has enabled things that were seemingly impossible with smaller models,” said Luis Vargas, who heads Microsoft’s AI at Scale initiative. Once the basic skills have been learnt, industry-specific skills can be added later. “Because every organization is going to have its own vocabulary, people can now easily fine tune that model to give it a graduate degree in understanding business, healthcare or legal domains,” Vargas added.