Generative AI Offerings and LLMs: A Comparison of Google vs Azure vs Amazon
June 5, 2023
In recent years, cloud computing has grown rapidly, changing how businesses operate and the services they provide. Three providers lead this shift: Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP). Among their many services, each now offers large language model (LLM) capabilities that let businesses apply natural language processing, data analysis, and predictive modeling.
This blog post examines and compares the LLM and generative AI offerings of AWS, Azure, and GCP. The goal is to highlight each platform's strengths, drawbacks, and fit for different use cases. By looking at the features, performance, scalability, and ease of use of these platforms, we aim to give users and enterprises the insight they need to make well-informed decisions when adopting LLM solutions.
Amazon Bedrock:
Amazon Bedrock is a managed AWS service that offers API access to a range of foundation models (FMs) from AI startups and from Amazon itself. The service aims to simplify the development of generative AI applications by letting users choose the most suitable FM for their needs without having to manage infrastructure. Bedrock provides a serverless experience, enabling a quick start, customization with the user's own data, and integration with familiar AWS tools.
Benefits of Amazon Bedrock:
Bedrock speeds up the development of generative AI applications, offers a choice of FMs from AI21 Labs, Anthropic, Stability AI, and Amazon, and lets teams deploy scalable, reliable applications with familiar AWS tools. It targets use cases such as text generation, chatbots, search, text summarization, image generation, and personalization.
Bedrock offers a range of foundation models, including AI21's Jurassic-2 models for multilingual text generation, Anthropic's Claude model for conversations and question answering, Stability AI's Stable Diffusion for generating realistic images, art, logos, and designs, and Amazon Titan for various text-related tasks such as summarization, generation, classification, open-ended Q&A, information extraction, and embeddings. Amazon Titan targets the same kind of general-purpose text tasks as OpenAI's GPT-4, but AWS has not published performance comparisons, so equivalent performance should not be assumed.
According to Vasi Philomin, VP of generative AI at AWS, the Titan model is similar to the one that powers search on Amazon.com.
Amazon Bedrock aims to provide developers with a range of FMs and infrastructure support to facilitate the creation of innovative AI applications, enhancing user experiences based on specific requirements.
At the time of writing, Bedrock is still in development and is available only through an early access waitlist. No formal pricing has been announced, so it is hard to say how the cost/performance ratio of the service will compare with other cloud providers. An in-house AWS service for generative AI should, however, be easier to integrate with applications already running on AWS.
AWS also offers a generative AI code completion service, Amazon CodeWhisperer, which is comparable to GitHub Copilot. While Copilot is free only for students, teachers, and maintainers of popular open source projects, CodeWhisperer includes an individual tier that provides unlimited code suggestions and reference tracking free of charge.
Microsoft Azure OpenAI:
In January 2023, Microsoft announced a significant investment in its partnership with OpenAI, marking the third phase of their collaboration. The multiyear, multibillion-dollar investment aims to accelerate AI breakthroughs and ensure the benefits of AI are accessible to a wide range of people.
The agreement allows both Microsoft and OpenAI to independently commercialize the advanced AI technologies resulting from their collaboration. Microsoft will increase its investments in the development and deployment of specialized supercomputing systems to support OpenAI's independent AI research. Azure's AI infrastructure will also be expanded to enable customers to build and deploy AI applications globally.
As part of the partnership, Microsoft will deploy OpenAI's models in its consumer and enterprise products, introducing new categories of digital experiences. This includes the Azure OpenAI Service, which provides developers with access to OpenAI models backed by Azure's trusted, enterprise-grade capabilities and infrastructure.
Azure will serve as OpenAI's exclusive cloud provider, powering all of its workloads across research, products, and API services. Both companies emphasize their commitment to responsible AI research and development, aiming to create AI systems and products that are trustworthy and safe. Their joint efforts have already resulted in significant achievements, such as the development of breakthrough AI models and the deployment of AI-powered products.
OpenAI and Azure OpenAI offer four base variants of the GPT-3 family of Generative Pre-trained Transformer (GPT) models:
Ada, Babbage, Curie, and Davinci. These models differ in parameter count, speed, cost, and the range of tasks they handle well. All four are text-only models. OpenAI has not officially mapped the smaller API model names to parameter counts, so the sizes given for them below are commonly cited estimates, and per-model training data sizes are not published.
Ada is the smallest and fastest model, estimated at about 350 million parameters. It can handle basic natural language tasks like classification, sentiment analysis, simple summarization, and simple conversation.
Babbage is a larger model, estimated at roughly 1.3 billion parameters. It can handle somewhat more complex natural language tasks, such as moderate classification and word analogy.
Curie is a large model, estimated at roughly 6.7 billion parameters. It is capable of more advanced natural language tasks such as translation, paraphrasing, summarization, and question answering.
Davinci is the largest and most powerful model, with 175 billion parameters. The 45TB figure often quoted for its training data refers to the raw Common Crawl text before filtering, which was reduced to about 570GB. Davinci can handle almost any text-based natural language task and can generate coherent and creative texts on various topics, demonstrating high fluency, consistency, and diversity.
The cost of using a generative language model on Azure is measured per 1,000 tokens.
Tokenization is the process of splitting text into smaller units, called tokens, that the model processes. Tokens can be words, subwords, characters, or symbols. The choice of tokenizer affects how well different languages are handled, how many tokens (and therefore how much cost) a given text incurs, and the quality of the generated text. Tokenization methods vary with the language and complexity of the text.
OpenAI and Azure OpenAI use Byte-Pair Encoding (BPE) for GPT models. BPE starts from individual characters or bytes and repeatedly merges the most frequent adjacent pair into a new token, which yields compact representations of common words while still covering rare words as sequences of smaller pieces. The GPT-3 models, from Ada to Davinci, share essentially the same vocabulary of roughly 50,000 tokens. In general, a larger vocabulary encodes text in fewer tokens but requires more model resources, so the choice of vocabulary size balances quality against efficiency.
Tokenization affects cost because billing and compute both scale with the number of tokens processed. OpenAI and Azure OpenAI charge different rates per 1,000 tokens for each model. For example, Davinci has been priced at $0.06 per 1,000 tokens and Ada at $0.0008; published rates change over time, so check the current pricing page before budgeting. Rates also differ by usage type, such as standard inference versus fine-tuned models.
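The merge step behind BPE can be illustrated with a toy example. The sketch below learns merges from a tiny whitespace-split corpus at the character level; it is a simplified teaching version, not the byte-level tokenizer OpenAI actually ships, and the function names are illustrative.

```python
from collections import Counter

def most_frequent_pair(words):
    """Return the most common adjacent symbol pair across all words, or None."""
    pairs = Counter()
    for symbols, freq in words.items():
        for left, right in zip(symbols, symbols[1:]):
            pairs[(left, right)] += freq
    # Ties are broken by first occurrence (dict insertion order).
    return max(pairs, key=pairs.get) if pairs else None

def merge_pair(words, pair):
    """Replace every occurrence of `pair` with one merged symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged

def learn_bpe(corpus, num_merges):
    """Learn up to `num_merges` merges; return the merges and the segmented words."""
    words = dict(Counter(tuple(word) for word in corpus.split()))
    merges = []
    for _ in range(num_merges):
        pair = most_frequent_pair(words)
        if pair is None:
            break
        merges.append(pair)
        words = merge_pair(words, pair)
    return merges, words

merges, words = learn_bpe("low low low lower lowest", num_merges=2)
print(merges)
print(words)
```

After two merges the frequent stem "low" has become a single token, while the rarer endings of "lower" and "lowest" remain split into smaller pieces.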
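The per-1,000-token arithmetic is simple enough to sketch. The example below uses the two rates quoted above purely as illustrative inputs; the function name, the dictionary, and the token counts are assumptions for this sketch, and real rates should be taken from the provider's current pricing page.

```python
def estimate_cost(prompt_tokens, completion_tokens, rate_per_1k_tokens):
    """Estimate the cost of one request billed per 1,000 tokens."""
    total_tokens = prompt_tokens + completion_tokens
    return total_tokens / 1000 * rate_per_1k_tokens

# Illustrative rates in USD per 1,000 tokens, taken from the figures quoted in this post.
RATES_PER_1K = {"davinci": 0.06, "ada": 0.0008}

# Hypothetical request: 1,500 prompt tokens and 500 completion tokens.
costs = {
    model: estimate_cost(1500, 500, rate)
    for model, rate in RATES_PER_1K.items()
}

for model, cost in costs.items():
    print(f"{model}: ${cost:.4f}")
```

At these rates the same 2,000-token request costs 75 times more on Davinci than on Ada, which is why picking the smallest model that meets the quality bar matters.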
Microsoft's open-source project, Semantic Kernel (SK), is a lightweight SDK that seamlessly integrates AI Large Language Models (LLMs) with traditional programming languages. SK combines natural language semantic functions, native code functions, and embeddings-based memory, enabling new possibilities and enhancing the value of applications through AI capabilities.
SK provides several features out of the box, including prompt templating, function chaining, vectorized memory, and intelligent planning capabilities. Together, these features let developers build LLM calls into ordinary application code.
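Prompt templating and function chaining can be illustrated without the SK library itself. The sketch below is plain Python and does not use or imitate the Semantic Kernel API; `make_semantic_function`, `chain`, and `echo_model` are hypothetical names, and `echo_model` is a stand-in for a real LLM call.

```python
from typing import Callable, List

def make_semantic_function(template: str, model: Callable[[str], str]) -> Callable[[str], str]:
    """Wrap a prompt template and a model call into a reusable function."""
    def run(text: str) -> str:
        # Fill the template with the input, then send the prompt to the model.
        return model(template.format(input=text))
    return run

def chain(functions: List[Callable[[str], str]]) -> Callable[[str], str]:
    """Compose functions so that each output becomes the next input."""
    def run(text: str) -> str:
        for fn in functions:
            text = fn(text)
        return text
    return run

def echo_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM: returns the prompt in brackets."""
    return f"[{prompt}]"

summarize = make_semantic_function("Summarize: {input}", echo_model)
translate = make_semantic_function("Translate to French: {input}", echo_model)

# str.upper plays the role of a native (non-LLM) code function in the chain.
pipeline = chain([summarize, str.upper, translate])
print(pipeline("hello"))
```

The point of the pattern is that prompt-backed functions and ordinary code functions share one call signature, so they can be mixed freely in a pipeline.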
The Semantic Kernel integrates advanced AI design patterns derived from the latest research. Developers can enhance their applications using plugins for various capabilities like prompt chaining, summarization, zero/few-shot learning, embeddings, and more. This encapsulation of design patterns enables the infusion of advanced AI capabilities into applications built with SK.
Google Vertex AI:
Vertex AI is Google Cloud's managed machine learning platform, and its generative AI support enables the deployment of large language models (LLMs) in production services. It provides a way to integrate LLMs or AI chatbots with existing IT systems, databases, and business data.
One of the key challenges in deploying LLMs is ensuring that they can accurately understand and interpret the meaning and intent of text inputs. Vertex AI claims to address this challenge by introducing the concept of grounding with embeddings and vector search.
Grounding refers to connecting LLM outputs to real business facts and data, ensuring that the generated responses are relevant and reliable. This is achieved by leveraging embeddings, which are representations of text in a high-dimensional space that capture the meaning and semantics of the text. Vertex AI provides Embeddings for Text, an API that generates 768-dimensional text embeddings from input text.
The embeddings generated by Vertex AI Embeddings for Text can be used for various text processing tasks, such as semantic search, text classification, recommendation systems, clustering, anomaly detection, sentiment analysis, and more. These tasks can be performed with a deep understanding of the text's context and nuances, thanks to the LLM's capabilities.
To enable efficient and fast search through the embedding space, Vertex AI incorporates vector search technology. Vector search involves calculating the distance or similarity between vectors to find similar embeddings. Google Cloud's vector search technology, powered by an algorithm called ScaNN, enables fast and scalable search even with millions or billions of embeddings.
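The similarity calculation at the heart of vector search can be sketched with a brute-force version. ScaNN performs approximate search to stay fast at scale; the example below is an exact, exhaustive stand-in for illustration only, and the three-dimensional vectors and document names are made up (real Embeddings for Text vectors have 768 dimensions).

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def top_k(query, index, k=2):
    """Return the ids of the k vectors most similar to the query."""
    scored = [(cosine_similarity(query, vec), doc_id) for doc_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored[:k]]

# Hypothetical toy index mapping document ids to embeddings.
index = {
    "refund-policy": [0.9, 0.1, 0.0],
    "shipping-times": [0.1, 0.9, 0.0],
    "returns-faq": [0.8, 0.2, 0.1],
}

nearest = top_k([1.0, 0.0, 0.0], index, k=2)
print(nearest)
```

Brute force compares the query against every stored vector, which is fine for thousands of items but is exactly the cost that approximate methods avoid at millions or billions of embeddings.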
Combining Vertex AI Embeddings for Text with the Matching Engine lets developers connect LLM outputs to real business data. The Matching Engine indexes embeddings and serves fast vector searches over them. Grounding LLM outputs in this way helps keep responses accurate and relevant, which makes LLMs more dependable for enterprise use.
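Once the relevant passages have been retrieved, grounding comes down to placing them in the prompt. The sketch below shows one common way to do that; the prompt wording, the function name, and the sample passages are assumptions for illustration, not a Vertex AI API.

```python
def build_grounded_prompt(question, passages):
    """Build a prompt that restricts the model to the retrieved facts."""
    context = "\n".join(f"- {passage}" for passage in passages)
    return (
        "Answer using only the facts below. "
        "If the facts are insufficient, say so.\n"
        f"Facts:\n{context}\n"
        f"Question: {question}"
    )

# Hypothetical passages, as if returned by a vector search over business data.
passages = [
    "Refunds are issued within 14 days of purchase.",
    "Refunds require the original receipt.",
]

prompt = build_grounded_prompt("How long do I have to request a refund?", passages)
print(prompt)
```

The resulting prompt would then be sent to the LLM, so the answer is drawn from the retrieved business data rather than from the model's general training alone.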
The pricing for Generative AI support on Vertex AI is based on the number of characters in both the input (prompt) and output (response) of the prediction request. The character count is determined by considering UTF-8 code points, and white space is not included in the count.
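The character-based counting rule described above can be sketched as follows. This assumes that Python's `str.isspace` is a reasonable match for what the service treats as white space, and the rate passed to `estimate_cost` is a made-up placeholder, not a published price.

```python
def billable_characters(text):
    """Count Unicode code points in the text, skipping white space."""
    return sum(1 for ch in text if not ch.isspace())

def estimate_cost(prompt, response, rate_per_1k_chars):
    """Estimate cost from input and output characters at a per-1,000-character rate."""
    total = billable_characters(prompt) + billable_characters(response)
    return total / 1000 * rate_per_1k_chars

prompt = "Hello, world!\n"
response = "héllo wörld"

print(billable_characters(prompt))
print(billable_characters(response))
# Hypothetical rate of 0.001 per 1,000 characters, for illustration only.
print(estimate_cost(prompt, response, rate_per_1k_chars=0.001))
```

Python strings iterate over code points, so an accented letter such as "é" counts once here even though it occupies two bytes in UTF-8.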
Here is a summary of the pros and cons of each AI platform:
Google Vertex AI
Pros:
1. Integration with Google Cloud ecosystem
2. Scalable infrastructure
3. User-friendly interface and intuitive tools
4. Grounding and embeddings for accurate understanding
5. Fast and scalable search through vector search
Cons:
1. Limited range of pre-trained LLM models
2. Pricing based on input/output response sizes
3. Specific dependencies on Google Cloud ecosystem
Microsoft Azure OpenAI
Pros:
1. Multiple variants of GPT models
2. Tight integration with Microsoft Azure
3. Access to OpenAI models backed by Azure's infrastructure
Cons:
1. Cost and complexity related to tokenization
2. Pricing variations based on usage type
3. Choosing the most appropriate GPT variant requires consideration of specific requirements
4. Dependency on Azure cloud platform
Amazon Bedrock
Pros:
1. Diverse foundation models for various use cases
2. Seamless integration with AWS services
3. Integration with existing AWS applications
4. Code completion service with Amazon CodeWhisperer
Cons:
1. Early access waitlist and undisclosed pricing
2. Performance comparisons with other LLMs not provided
3. Vendor lock-in with AWS ecosystem
All three platforms (Vertex AI, Azure OpenAI, Amazon Bedrock) provide AI services and solutions. They offer pre-trained language models for various natural language processing tasks. Integration with their respective cloud ecosystems is a key feature, allowing users to leverage other services and tools within the same platform. Each platform aims to provide user-friendly interfaces and tools for model development, deployment, and management.
Pre-trained models vary across platforms, impacting their suitability for specific use cases. Pricing structures differ based on factors like response sizes, tokenization, and usage types. Integration may be affected by the platform's tie to a specific cloud provider. Each platform offers unique features: Vertex AI focuses on grounding and embeddings, Azure OpenAI provides multiple GPT variants, and Amazon Bedrock offers diverse foundation models.
We hope you found our blog post informative. If you have any project inquiries or would like to discuss your data and analytics needs, please don't hesitate to contact us at firstname.lastname@example.org. We're here to help! Thank you for reading.