See how a multi-model approach works and how companies have successfully implemented this approach to increase performance and reduce costs.
Combining the strengths of different AI models in a single application can be a powerful strategy to help you meet your performance goals. This approach harnesses the power of multiple AI systems to improve accuracy and reliability in complex scenarios.
More than 1,800 AI models are available in Microsoft's model catalog. Even more models and services are available through Azure OpenAI Service and Azure AI Foundry, so you can find the right models to build an optimal AI solution.
Let's look at how a multi-model approach works and explore some scenarios where companies have successfully implemented this approach to increase performance and reduce costs.
How does a multi-model approach work?
A multi-model approach combines different AI models to solve tasks more effectively. Models are trained for different tasks or aspects of a problem, such as language understanding, image recognition, or data analysis. Models can work in parallel, processing different parts of the input data simultaneously; requests can be routed to the most relevant model; or the models can be used in different ways throughout an application.
Suppose you want to pair a fine-tuned vision model with a large language model to perform complex image classification tasks in conjunction with natural language questions. Or perhaps you have a small model fine-tuned to generate SQL queries against your database schema, and you would like to pair it with a larger model for more general tasks such as information retrieval and research assistance. In both cases, a multi-model approach could offer you the adaptability to build a comprehensive AI solution that meets your organization's specific requirements.
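As a sketch of the second pairing, a lightweight router might decide whether a prompt should go to the small SQL-tuned model or the larger general-purpose model. The model names, the keyword heuristic, and the `call_model` helper below are hypothetical placeholders, not real endpoints.

```python
# Hypothetical sketch: routing prompts between a small SQL-tuned model
# and a larger general-purpose model. Names and call_model are
# illustrative stand-ins, not real APIs.

SQL_KEYWORDS = {"select", "query", "table", "schema", "join", "sql"}

def call_model(model_name: str, prompt: str) -> str:
    # Stand-in for a real inference call (SDK or REST).
    return f"[{model_name}] {prompt}"

def route(prompt: str) -> str:
    words = set(prompt.lower().split())
    # Database-flavored prompts go to the specialist; everything else
    # goes to the general model.
    if words & SQL_KEYWORDS:
        return call_model("sql-specialist-slm", prompt)
    return call_model("general-llm", prompt)

print(route("Generate a SQL query against the orders table"))
print(route("Summarize recent research on solid-state batteries"))
```

In a real deployment, the keyword check would typically be replaced by a small classifier or an intent-detection call.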
Before implementing a multi-model strategy
First, identify and understand the outcome you want to achieve, because this is key to selecting and deploying the right AI models. In addition, each model has its own set of strengths and challenges to consider so that you can choose the right ones for your goals. Before implementing a multi-model strategy, there are several things to consider, including:
- The intended purpose of the models.
- The application's requirements for model size.
- Training and managing specialized models.
- The varying degrees of accuracy required.
- Governance of the application and the models.
- Security and potential bias of the models.
- Cost of models and expected cost at scale.
- The right programming language (up-to-date information on the best languages to use with specific models can be found on DevQualityEval).
The weight you assign to each criterion will depend on factors such as your goals, technology stack, resources, and other variables specific to your organization.
Let's look at some scenarios, as well as several customers who have implemented multiple models into their workflows.
Scenario 1: Routing
Routing is when AI and machine learning technologies are used to optimize the most efficient paths for use cases such as call centers, logistics, and more. Here are some examples:
Multimodal routing for diverse data processing
One innovative application of multiple models is routing tasks in parallel across different multimodal models that specialize in processing specific data types such as text, images, sound, and video. For example, you can use a combination of a smaller model, such as GPT-3.5 Turbo, with a multimodal large language model, such as GPT-4o, depending on the modality. This routing lets you process multiple modalities by directing each data type to the model best suited to it, increasing the overall performance and versatility of the system.
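A minimal sketch of this kind of modality-based routing might look like the following; the model names mirror the examples above, and the dispatch table itself is an assumption rather than a real API.

```python
# Hypothetical modality router: each data type is sent to the model
# best suited to it. Model names are illustrative.

MODALITY_MODELS = {
    "text": "gpt-3.5-turbo",   # smaller, cheaper model for plain text
    "image": "gpt-4o",         # multimodal model for images
    "audio": "gpt-4o",
    "video": "gpt-4o",
}

def route_by_modality(modality: str) -> str:
    try:
        return MODALITY_MODELS[modality]
    except KeyError:
        raise ValueError(f"unsupported modality: {modality}")

print(route_by_modality("text"))
print(route_by_modality("image"))
```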
Expert routing for specialized domains
Another example is expert routing, where queries are directed to specialized models, or "experts," based on the subject or domain the task relates to. By implementing expert routing, a company ensures that different types of user queries are handled by the most suitable model or AI service. For example, technical support questions might be directed to a model trained on technical documentation and support tickets, while general information requests might be handled by a more general-purpose language model.
Expert routing can be especially useful in fields such as medicine, where different models can be fine-tuned to handle specific topics or images. Instead of relying on one large model, multiple smaller models such as Phi-3.5-mini-instruct and Phi-3.5-vision-instruct, each optimized for a defined area such as chat or vision, can be used so that each question is handled by the most appropriate expert model, increasing the accuracy and relevance of the output. This approach can improve response accuracy and reduce the cost of fine-tuning large models.
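As a rough sketch, expert routing can be as simple as a small classifier in front of a table of expert models. The classification rule here (checking for an attached image) is a deliberately naive stand-in for the lightweight classifier a production system would use.

```python
# Hypothetical expert router for the medical example above: queries
# with an image go to the vision expert, text-only queries to the
# chat expert.

EXPERTS = {
    "chat": "Phi-3.5-mini-instruct",
    "vision": "Phi-3.5-vision-instruct",
}

def classify(query: dict) -> str:
    # A production system might run a small classifier model here;
    # this sketch just checks whether an image is attached.
    return "vision" if query.get("image") else "chat"

def route_to_expert(query: dict) -> str:
    return EXPERTS[classify(query)]

print(route_to_expert({"text": "Summarize this patient history"}))
print(route_to_expert({"text": "What does this scan show?", "image": b"..."}))
```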
Car manufacturer
One example of this type of routing comes from a large car manufacturer. They implemented a Phi model to quickly process most basic tasks while routing more complex tasks in parallel to a large language model such as GPT-4o. The offline Phi-3 model can quickly handle most of the data processing locally, while the online GPT model provides the computing power for larger, more complex questions. This combination helps take advantage of the cost-effective capabilities of Phi-3 while ensuring that more complex, business-critical queries are processed effectively.
Sage
Another example shows industry-specific use cases of expert routing. Sage, a leader in accounting, finance, human resources, and payroll software for small and medium-sized businesses (SMBs), wanted to help its customers discover efficiencies in their accounting processes and increase productivity through AI services that could automate routine tasks and provide real-time insights.
Recently, Sage deployed Mistral, a commercially available large language model, and fine-tuned it with accounting-specific data to address gaps in the GPT-4 model used for its Sage Copilot. This fine-tuning allowed Mistral to better understand and respond to accounting questions, so it could categorize user questions more effectively and route them to the appropriate agents or deterministic systems. For example, while the off-the-shelf Mistral model might struggle with a cash-flow question, the fine-tuned version could accurately route the question through both Sage and domain-specific data, ensuring accurate and relevant responses for users.
Scenario 2: Online and Offline Use
Online and offline scenarios provide the dual benefits of storing and processing information locally with an offline AI model, as well as using an online AI model to access globally available data. In this setup, an organization could run a local model for specific tasks on devices (such as a customer service chatbot), while still having access to an online model that can provide data in a broader context.
Deploying a hybrid model for medical diagnosis
In the healthcare sector, AI models could be deployed in a hybrid manner to provide both online and offline capabilities. In one example, a hospital could use an offline AI model to handle initial diagnostics and data processing locally on IoT devices. At the same time, an online AI model could be used to access the latest medical research from cloud databases and medical journals. While the offline model processes patient information locally, the online model provides globally available medical data. This online-offline combination helps ensure that staff can carry out patient assessments effectively while still benefiting from access to the latest advances in medical research.
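A hybrid deployment like the hospital example can be sketched as a local-first pipeline that enriches results from the cloud only when connectivity allows. The `local_model` and `cloud_model` functions below are hypothetical stand-ins for real on-device and cloud inference calls.

```python
# Hypothetical hybrid (online/offline) sketch. local_model and
# cloud_model stand in for real on-device and cloud inference.

def local_model(patient_data: dict) -> str:
    # Offline model: always available, runs on the device.
    return f"initial assessment for {patient_data['id']}"

def cloud_model(assessment: str) -> str:
    # Online model: adds context from up-to-date medical research.
    return f"latest research relevant to: {assessment}"

def diagnose(patient_data: dict, network_available: bool) -> dict:
    assessment = local_model(patient_data)      # always runs locally
    result = {"assessment": assessment, "enriched": False}
    if network_available:                       # enrich only when online
        result["research"] = cloud_model(assessment)
        result["enriched"] = True
    return result

print(diagnose({"id": "patient-42"}, network_available=False))
print(diagnose({"id": "patient-42"}, network_available=True))
```

The key design point is that the offline path never depends on the network, so basic service degrades gracefully during outages.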
Smart home systems with local and cloud AI
In smart home systems, multiple AI models can be used to manage both online and offline tasks. An offline AI model can be embedded in the home network to control basic functions such as lighting, temperature, and security systems, enabling faster response times and allowing essential services to keep operating even during internet outages. Meanwhile, an online AI model can be used for tasks that require access to cloud services for updates and advanced processing, such as voice recognition and smart device integration. This dual approach allows smart home systems to maintain basic operations independently while using cloud capabilities for enhanced features and updates.
Scenario 3: Combining task-specific and larger models
Companies looking to optimize for cost savings might consider combining a small but powerful task-specific SLM, such as Phi-3, with a robust large language model. One way this might work is to deploy Phi-3, one of Microsoft's family of powerful small language models with groundbreaking performance at low cost and low latency, in edge computing scenarios or applications with stricter latency requirements, alongside the computing power of a larger model such as GPT.
In addition, Phi-3 could serve as an initial filter or triage system, handling straightforward questions and escalating only the more nuanced or demanding requests to GPT models. This tiered approach helps optimize workflow efficiency and reduces unnecessary use of more expensive models.
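The triage pattern above can be sketched as a simple cascade: answer with the small model when it is confident, and escalate otherwise. The confidence score, threshold, and model helpers are all illustrative assumptions rather than real APIs.

```python
# Hypothetical cascade: try the small model first, escalate to the
# larger model only when confidence falls below a threshold.

def small_model(prompt: str) -> tuple[str, float]:
    # Stand-in for a Phi-style SLM returning (answer, confidence).
    confident = len(prompt.split()) < 10   # toy confidence heuristic
    return f"[slm] {prompt}", 0.9 if confident else 0.4

def large_model(prompt: str) -> str:
    # Stand-in for a larger GPT-style model.
    return f"[llm] {prompt}"

CONFIDENCE_THRESHOLD = 0.8

def cascade(prompt: str) -> str:
    answer, confidence = small_model(prompt)
    if confidence >= CONFIDENCE_THRESHOLD:
        return answer                       # cheap, low-latency path
    return large_model(prompt)              # escalate the hard cases

print(cascade("What is our refund policy?"))
```

In practice the confidence signal might come from log-probabilities, a verifier model, or routing rules, but the cost logic is the same: the expensive model only runs when the cheap one is unsure.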
By building a setup of complementary small and large models, businesses can potentially achieve cost-effective performance tailored to their specific use cases.
Capacity
Capacity's AI-powered Answer Engine® retrieves accurate answers for users in seconds. By using leading AI technologies, Capacity gives organizations a personalized AI research assistant that can scale easily across all teams and departments. They needed a way to help unify diverse data sets and make information more easily accessible and understandable for their customers. By using Phi, Capacity was able to provide businesses with an effective AI knowledge-management solution that improves information accessibility, security, and operational efficiency, saving customers time and hassle. Following the successful implementation of Phi-3-medium, Capacity is now eagerly testing Phi-3.5-MoE for use in production.
Our commitment to trustworthy AI
Organizations across industries are using Azure AI and Copilot capabilities to drive growth, increase productivity, and create value.
We are committed to helping organizations use and build AI that is trustworthy, meaning it is secure, private, and safe. We bring best practices and learnings from decades of research and building AI products at scale to provide industry-leading commitments and capabilities that span our three pillars of security, privacy, and safety. Trustworthy AI is only possible when you combine our commitments, such as our Secure Future Initiative and our responsible AI principles, with our product capabilities to unlock AI transformation with confidence.
Get started with Azure AI Foundry
To learn more about enhancing the reliability, security, and performance of your cloud and AI investments, explore the additional resources below.
- Read about Phi-3-mini, which performs better than some models twice its size.