In this fourth part of my Generative AI series, aimed specifically at organizational use, I’m focusing on Azure OpenAI: Microsoft’s solution for using Generative AI features securely within your own tenant and in line with your company and privacy policies.
Overview of Azure OpenAI Service
Azure OpenAI provides a cloud-based service for interacting with OpenAI models like GPT-3, GPT-4, DALL-E, and Codex.
It supports the latest GPT-4 models, including GPT-4 and GPT-4-32K.
The service is designed to offer advanced language AI with the security and enterprise promise of Azure. Microsoft co-develops the APIs with OpenAI, which keeps the two compatible and makes it easier to move from one to the other.
Azure OpenAI doesn’t use customer data to retrain models, ensuring data privacy and security.
Understanding Tokens and Quotas
In the context of Generative AI, a token typically refers to a unit of data that the model reads during training and generation. Here’s a detailed breakdown:
A token can be as small as a character or as long as a word. In some languages or models, it could even represent a syllable or a phrase. The way text is tokenized can significantly impact the model’s performance and the types of patterns it can learn.
Tokenization is the process of splitting a text into tokens. This is a crucial step as it affects how the AI model will interpret the text. For example, the sentence “ChatGPT is great!” could be tokenized into three tokens: [“ChatGPT”, ” is”, ” great!”], but it could also be something more detailed such as: [“Chat”, “GPT”, ” is”, ” great”, “!”].
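To make the two splits above concrete, here is a minimal sketch in Python. The two functions and their regular expressions are my own toy rules for illustration; real GPT models use a learned byte pair encoding vocabulary, so their actual splits can differ.

```python
import re


def coarse_tokenize(text: str) -> list[str]:
    """Split on whitespace, keeping the leading space attached to each token."""
    return re.findall(r"\s?\S+", text)


def fine_tokenize(text: str) -> list[str]:
    """Also split on case changes and punctuation (a toy rule, not BPE)."""
    pattern = r"\s?(?:[A-Z][a-z]+|[A-Z]+|[a-z]+)|[^\w\s]"
    return re.findall(pattern, text)


sentence = "ChatGPT is great!"
print(coarse_tokenize(sentence))  # ['ChatGPT', ' is', ' great!']
print(fine_tokenize(sentence))    # ['Chat', 'GPT', ' is', ' great', '!']
```

Both functions keep every character, so joining the tokens gives back the original sentence. The finer split simply produces more tokens for the same text, which matters later on when usage is counted in tokens.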
Once text is tokenized, each token is typically mapped to a unique identifier (a token ID) based on a vocabulary that the model has learned. These IDs are used internally by the model during training and generation.
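The mapping from tokens to IDs can be sketched like this. Note the assumption: I build the vocabulary on the fly from a single sentence, whereas a real model ships with a fixed vocabulary that was determined up front. The helper names are hypothetical.

```python
def build_vocab(tokens: list[str]) -> dict[str, int]:
    """Assign each distinct token an ID, in order of first appearance."""
    vocab: dict[str, int] = {}
    for token in tokens:
        if token not in vocab:
            vocab[token] = len(vocab)
    return vocab


def encode(tokens: list[str], vocab: dict[str, int]) -> list[int]:
    """Turn tokens into token IDs."""
    return [vocab[token] for token in tokens]


def decode(ids: list[int], vocab: dict[str, int]) -> str:
    """Turn token IDs back into text by reversing the vocabulary."""
    id_to_token = {token_id: token for token, token_id in vocab.items()}
    return "".join(id_to_token[token_id] for token_id in ids)


tokens = ["Chat", "GPT", " is", " great", "!"]
vocab = build_vocab(tokens)
ids = encode(tokens, vocab)
print(ids)                 # [0, 1, 2, 3, 4]
print(decode(ids, vocab))  # ChatGPT is great!
```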
The token IDs are often used to look up embeddings – high-dimensional vectors that represent each token in a continuous vector space. These embeddings are learned during training and capture semantic relationships between tokens.
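An embedding lookup is, at its core, just indexing into a table with the token ID. The sketch below assumes a tiny table filled with random numbers, so it only shows the mechanics; in a real model these vectors are learned during training and that is where the semantic relationships come from.

```python
import math
import random

EMBEDDING_DIM = 4  # tiny on purpose; real models use far larger vectors


def make_embedding_table(vocab_size: int, dim: int, seed: int = 0) -> list[list[float]]:
    """Create one random vector per token ID (a stand-in for learned weights)."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(vocab_size)]


def lookup(table: list[list[float]], token_ids: list[int]) -> list[list[float]]:
    """Fetch the vector for each token ID."""
    return [table[token_id] for token_id in token_ids]


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Compare two vectors by the angle between them."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


table = make_embedding_table(vocab_size=5, dim=EMBEDDING_DIM)
vectors = lookup(table, [0, 2, 4])
print(len(vectors), len(vectors[0]))
print(cosine_similarity(vectors[0], vectors[1]))
```

With trained embeddings, tokens that appear in similar contexts tend to get a higher cosine similarity. With the random table used here the score has no meaning.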
In generative models like GPT, tokens are processed sequentially, and the model learns to predict the next token based on the previous tokens.
During generation, the model uses the tokens in the provided text as context to generate coherent and contextually relevant text. The generated text itself consists of a sequence of tokens which are then detokenized into human-readable text.
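The predict-the-next-token loop can be illustrated with a much simpler stand-in. The sketch below uses a bigram counter, which only looks at the single previous token, instead of a neural network that looks at the whole context as GPT does. The vocabulary and training sequence are made up for the example.

```python
from collections import Counter, defaultdict

# Hypothetical toy vocabulary: the list index is the token ID.
VOCAB = ["I", " like", " tea", " coffee", "."]

# "I like tea. I like tea. I like coffee." as token IDs.
TRAINING_IDS = [0, 1, 2, 4, 0, 1, 2, 4, 0, 1, 3, 4]


def train_bigram(token_ids: list[int]) -> dict[int, Counter]:
    """Count how often each token follows each other token."""
    counts: dict[int, Counter] = defaultdict(Counter)
    for previous, following in zip(token_ids, token_ids[1:]):
        counts[previous][following] += 1
    return counts


def generate(counts: dict[int, Counter], prompt_ids: list[int], max_new_tokens: int) -> list[int]:
    """Repeatedly append the most frequent follower of the last token."""
    output = list(prompt_ids)
    for _ in range(max_new_tokens):
        followers = counts.get(output[-1])
        if not followers:
            break  # nothing ever followed this token in training
        output.append(followers.most_common(1)[0][0])
    return output


def detokenize(token_ids: list[int]) -> str:
    """Turn token IDs back into human-readable text."""
    return "".join(VOCAB[token_id] for token_id in token_ids)


model = train_bigram(TRAINING_IDS)
generated = generate(model, prompt_ids=[0], max_new_tokens=3)
print(generated)              # [0, 1, 2, 4]
print(detokenize(generated))  # I like tea.
```

The loop always picks the most frequent follower, which is why " tea" wins over " coffee". Real models usually sample from a probability distribution instead, so the same prompt can give different answers.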
This way, tokens play a crucial role in both the training and generation phases of generative AI models, impacting how these models understand and generate text.
The Azure OpenAI service sets rate limits on requests per minute (RPM) and tokens per minute (TPM) to manage usage.
Free trial and pay-as-you-go subscriptions have different rate limits.
The rate limits also differ per model being accessed (e.g., GPT-3, GPT-3.5 Turbo, etc.).
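To get a feeling for how an RPM and a TPM limit interact, here is a small client-side sketch. The class name, the limits of 3 requests and 1000 tokens, and the sliding 60-second window are all assumptions for illustration; the service enforces its own limits on the server side and may count differently.

```python
from collections import deque


class SlidingWindowLimiter:
    """Track requests and tokens used in the last 60 seconds."""

    def __init__(self, max_rpm: int, max_tpm: int) -> None:
        self.max_rpm = max_rpm
        self.max_tpm = max_tpm
        self.history: deque = deque()  # (timestamp in seconds, tokens used)

    def try_acquire(self, tokens: int, now: float) -> bool:
        """Return True and record the request if it fits both limits."""
        # Drop requests that are 60 seconds old or older.
        while self.history and now - self.history[0][0] >= 60:
            self.history.popleft()
        tokens_used = sum(used for _, used in self.history)
        if len(self.history) >= self.max_rpm:
            return False  # would exceed requests per minute
        if tokens_used + tokens > self.max_tpm:
            return False  # would exceed tokens per minute
        self.history.append((now, tokens))
        return True


limiter = SlidingWindowLimiter(max_rpm=3, max_tpm=1000)
print(limiter.try_acquire(400, now=0))   # True
print(limiter.try_acquire(400, now=1))   # True
print(limiter.try_acquire(400, now=2))   # False: 1200 tokens is over the TPM limit
print(limiter.try_acquire(100, now=3))   # True
print(limiter.try_acquire(50, now=4))    # False: a 4th request is over the RPM limit
print(limiter.try_acquire(50, now=61))   # True: the two oldest requests have expired
```

Passing the time in as `now` keeps the example deterministic. In a real client you would use a clock and wait or retry when a request is refused.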
Recent Updates (as of September 2023)
GPT-4 and GPT-4-32K are now available to all Azure OpenAI Service customers without a waitlist application.
Support for the GPT-3.5 Turbo Instruct model and the Whisper (speech to text) model has been introduced.
The service has been extended to new regions like Sweden Central, Switzerland North, Canada East, East US 2, Japan East, and North Central US among others.
Other updates cover support for function calling, a larger embedding input array, and availability in additional regions.
Azure OpenAI on your data, which is in preview, can now be deployed to Power Virtual Agents and supports private endpoints, among other updates.
The combination of Cognitive Search, Blob Storage and Azure OpenAI
Azure OpenAI, together with Azure Cognitive Search and Blob Storage, creates a strong trio that helps manage and find data easily within an organization. Azure Cognitive Search helps in finding specific information quickly even from a huge amount of data, while Azure Blob Storage is like a big digital storage room where a lot of data can be kept safely.
When Azure OpenAI is added to this mix, it helps in automatically understanding and analyzing the data stored in Blob Storage, providing useful information that can help in making smart decisions.
Also, when Azure OpenAI works with Azure Cognitive Search, it helps in creating a smart search system that can understand and respond to questions asked in simple language, delivering more accurate answers using Power Virtual Agents.
This combination not only makes handling data easier but also helps in getting more value from the data, which is very helpful for organizations looking to improve their project management and data analysis tasks.
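The search-then-answer flow described above can be sketched without any Azure service at all. In the example below a dictionary stands in for Blob Storage, a simple word-overlap score stands in for Azure Cognitive Search, and the resulting prompt is what you would send to a chat model. All names and sample documents are made up, and none of this is the actual Azure SDK.

```python
import re

# Stand-in for files in Blob Storage: file name -> text (hypothetical data).
DOCUMENTS = {
    "vacation-policy.txt": "Employees receive 25 vacation days per year.",
    "expense-policy.txt": "Expenses above 500 euros require manager approval.",
    "onboarding.txt": "New employees receive a laptop on their first day.",
}


def words(text: str) -> set[str]:
    """Lowercase the text and return its distinct words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def search(query: str, documents: dict[str, str], top_k: int = 1) -> list[str]:
    """Rank documents by shared words (a toy stand-in for a search index)."""
    query_words = words(query)
    scored = [(len(query_words & words(body)), name) for name, body in documents.items()]
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [name for score, name in ranked[:top_k] if score > 0]


def build_prompt(question: str, documents: dict[str, str], top_k: int = 1) -> str:
    """Put the best matching documents in front of the question."""
    sources = search(question, documents, top_k)
    context = "\n".join(f"[{name}] {documents[name]}" for name in sources)
    return (
        "Answer using only the sources below.\n"
        f"Sources:\n{context}\n"
        f"Question: {question}"
    )


question = "How many vacation days do employees get?"
print(search(question, DOCUMENTS))  # ['vacation-policy.txt']
print(build_prompt(question, DOCUMENTS))
```

The point of the design is that the model only sees the few documents the search step selected, so its answer stays grounded in your own data and the prompt stays small in terms of tokens.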
Final notes
I believe that’s enough on Generative AI for organizations, at least until Microsoft Copilot is available in my organization. Then we can talk more about Semantic Index, plug-ins, etc.
In the meantime, I published a new video for the first time in 70+ days! Here it is:
And seeing as it’s already the 5th year I’m doing videos on YouTube, I thought it would be a great time to ask for your input. Am I still on the right track? Should I change my content? Please help by filling in the short form.