Multimodal AI Applications | 15 Examples

A fact that holds true for all types of AI technology is that AI is evolving, and it’s evolving at an insanely rapid rate. Gone are the days of AI tech that relied solely on textual input. Enter Multimodal AI. This (not so) new kid on the block has taken generative AI to a different level. Multimodal AI systems generate more accurate, human-like responses by combining information from various sources like images, sound, and text to build a more nuanced understanding of the context. Think LLMs with multiple modes of data input. Or simply, ChatGPT-Sora-Midjourney on steroids. Here, we look at the top 15 use cases of multimodal AI applications.

Multimodal AI Applications – 15 use cases

 

 

1. Oxford Cancer Biomarkers: IBM PowerAI Vision

Oxford Cancer Biomarkers (OCB) is a pioneer in personalized medicine in the field of oncology. Colorectal cancer diagnosis and treatment are costly and time-consuming. Integrating AI systems into healthcare can save thousands of lives and billions of dollars. Working with Meridian IT to deploy IBM PowerAI Vision on IBM Power Systems has allowed OCB to accelerate the analysis of digital tumor images to determine the likelihood that colorectal cancer will recur. This has significantly reduced unnecessary chemotherapy and its side effects.

With AI systems, predicting relapse risk now takes a few hours rather than months. The approach has set a new standard of care, enhancing patient outcomes and improving efficiency, and it can be scaled easily for deployment across the globe.

 

2. BBC Studios: IBM Turbonomic

In 2014, when Server and Storage Manager Porsche Waddell joined BBC Studios, he inherited a lean team and an environment running at over 95% capacity and marred by frequent application performance issues. This hurt the business and undermined its mission to consistently deliver high-quality content that informs, educates, and inspires viewers across the globe. Because the existing tools could not accurately identify the root cause, the problems created turbulence in the team. The company was in dire need of a cost-effective way to assure application performance.

IBM Turbonomic Application Resource Management helped combat these troubles by managing the company’s 1,000 virtual machines. Within a month, the company had reclaimed 228 GB of memory. Turbonomic not only recommends specific actions to take but also predicts the impact of each action before it is executed. This significantly reduced end-user complaints and eliminated downtime in processing.

 

3. Canon Marketing Japan: IBM FlashSystem Storage

Canon Marketing Japan Inc. is a multinational company specializing in optical, imaging, and industrial products such as lenses, cameras, medical equipment, scanners, printers, and semiconductor manufacturing equipment. The company was looking for a way to boost employee productivity and satisfaction by improving response times for vital information systems. Deploying a software-defined storage infrastructure based on IBM technology enabled data processing that is 10 times faster.
As a result, online processing became faster, and daily wait times were reduced by 554 hours across the workforce. The deployment also reduced administrative work and maximized the value of the company’s IT investments.

 

4. Google: TensorFlow

Google uses the TensorFlow AI system to surface the most relevant information for us. TF-Ranking makes it quick and simple to create high-quality ranking models. This unified framework lets ML researchers, practitioners, and enthusiasts evaluate and choose among an array of ranking models within a single library.
TF-Ranking is based on a novel scoring mechanism in which multiple web pages can be scored jointly. The main challenge in multi-page scoring is inference, because the pages need to be grouped and scored in sub-groups. TF-Ranking provides a List-In-List-Out API that wraps all of this logic in the exported TF models, which keeps ranking efficient.
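
The idea of scoring pages jointly in sub-groups can be illustrated with a toy sketch in plain Python. It does not use the TF-Ranking API. The scorer toy_group_score and the feature values are hypothetical, chosen only to show how pages are grouped, scored together, and averaged into one ranking.

```python
from collections import defaultdict
from itertools import combinations


def toy_group_score(group):
    """Hypothetical joint scorer: score each page relative to its group's mean."""
    mean = sum(group) / len(group)
    return [value - mean for value in group]


def groupwise_rank(features, group_size=2):
    """Score every sub-group of pages jointly, then average per-page scores.

    `features` holds one number per page (an assumption for brevity; real
    models use feature vectors). Requires len(features) >= group_size.
    Returns (page indices best-first, averaged scores).
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for idxs in combinations(range(len(features)), group_size):
        scores = toy_group_score([features[i] for i in idxs])
        for i, score in zip(idxs, scores):
            totals[i] += score
            counts[i] += 1
    averaged = [totals[i] / counts[i] for i in range(len(features))]
    order = sorted(range(len(features)), key=lambda i: averaged[i], reverse=True)
    return order, averaged


order, averaged = groupwise_rank([0.2, 0.9, 0.5], group_size=2)
```

A list of pages goes in and a ranked list comes out, which is the spirit of a list-in, list-out interface: the grouping logic stays hidden from the caller.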

5. Twitter : TensorFlow

TensorFlow has unlocked significant benefits for Twitter timelines on multiple fronts. It has improved model quality and the quality of timelines for Twitter users. It has also reduced training and model iteration time.
The home timeline is the default starting point for most Twitter users. Its job is to show each user the tweets most relevant to them. According to Twitter’s metrics and surveys, users are more satisfied when the best tweets are displayed first. Ranking the timeline by relevance has also increased user engagement.
To rank the timeline, a relevance model built with TensorFlow scores each candidate tweet to predict how relevant it is to each user. The model uses thousands of features drawn from three entities: the Tweet, the Author, and the User.
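
As a rough illustration of scoring candidates from tweet, author, and user features, here is a minimal sketch in plain Python. It is not Twitter's model: the three feature names, the weights, and the bias are all hypothetical, and a logistic function stands in for the trained network.

```python
import math

# Hypothetical weights; a production model would learn these from data
# and use thousands of features rather than three.
WEIGHTS = {
    "tweet_has_image": 0.8,
    "author_follower_ratio": 0.5,
    "user_interacted_with_author": 1.5,
}
BIAS = -1.0


def relevance(tweet, author, user):
    """Combine features from the three entities into a score in (0, 1)."""
    features = {**tweet, **author, **user}
    z = BIAS + sum(WEIGHTS.get(name, 0.0) * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


def rank_timeline(candidates):
    """Return candidate tweets ordered from most to least relevant."""
    return sorted(
        candidates,
        key=lambda c: relevance(c["tweet"], c["author"], c["user"]),
        reverse=True,
    )


candidates = [
    {"id": "a", "tweet": {"tweet_has_image": 1},
     "author": {"author_follower_ratio": 0.2},
     "user": {"user_interacted_with_author": 0}},
    {"id": "b", "tweet": {"tweet_has_image": 0},
     "author": {"author_follower_ratio": 0.2},
     "user": {"user_interacted_with_author": 1}},
]
ranked_ids = [c["id"] for c in rank_timeline(candidates)]
```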

 

6. Amazon Ads: PyTorch, TorchServe, and AWS Inferentia

Amazon Ads uses PyTorch, TorchServe, and AWS Inferentia to cut inference costs by 71% and drive scale-out. The ads help companies build their brand and connect with shoppers. They appear across websites, apps, and streaming TV content in more than 15 countries. Businesses and brands of all sizes, including registered sellers, vendors, book vendors, app developers, and more, can upload their creative ads. These ads can include images, video, audio, and products sold on Amazon.

The ads must comply with content guidelines to promote an accurate, safe, and pleasant shopping experience. Machine learning models are used to ensure the ads meet the required policies and standards. PyTorch is used to build computer vision and natural language processing models that automatically flag potentially non-compliant ads. PyTorch is highly intuitive, flexible, and user-friendly. Deploying these models on AWS Inferentia instead of GPU-based instances has reduced inference latency by 30% and cost by 71%.
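
One simple way to combine a vision model and a text model when flagging an ad is late fusion of their scores. The sketch below assumes each model has already produced a violation probability. The AdScores type, the max rule, and the 0.7 threshold are illustrative assumptions, not Amazon's actual pipeline.

```python
from dataclasses import dataclass


@dataclass
class AdScores:
    image_violation: float  # hypothetical vision-model output in [0, 1]
    text_violation: float   # hypothetical NLP-model output in [0, 1]


def review_decision(scores, flag_threshold=0.7):
    """Flag the ad if either modality looks non-compliant (max fusion)."""
    combined = max(scores.image_violation, scores.text_violation)
    return "flag_for_review" if combined >= flag_threshold else "approve"


decision = review_decision(AdScores(image_violation=0.1, text_violation=0.9))
```

Taking the maximum means a problem in one modality is enough to send the ad for review, even when the other modality looks clean.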

 

7. Coca-Cola: TensorFlow

Coca-Cola has used TensorFlow to achieve a long-sought proof-of-purchase capability. The product code recognition platform has supported more than a dozen promotions, resulting in over 180,000 scanned codes.
It has proved beneficial to the company for two reasons:
• Seamless proof-of-purchase in a timely fashion that corresponds to a mobile-first marketing platform.
• Coke could save millions of dollars by avoiding updating the printers in production lines to support higher-fidelity fonts.

This TensorFlow-based machine learning model is what enabled mobile proof-of-purchase at Coca-Cola.

 

8. BuzzFeed: Uncubed

BuzzFeed is a media company that entertains millions of people every day with engaging and informative content. Its audience is large, and so is the number of applicants hoping to be hired. Identifying the ideal candidate for each job was the company’s biggest recruiting challenge. It teamed up with Uncubed to develop an AI-driven solution based on IBM Watson Candidate Assistant, which pinpoints top candidates and encourages them to apply for the role.
The result was 64% more applicants, a more efficient recruitment process, and a lower risk of mis-hires.

 

 

9. CoStar: Amazon Rekognition Content Moderation

CoStar is a world-leading real estate company that provides information, analytics, and news to help clients make smart investments and lead the pack. CoStar Group delivers its content efficiently by using Amazon Rekognition for content moderation, which automatically moderates the images and videos uploaded to the CoStar platform. More than 150,000 images and videos are uploaded, and each must be checked for appropriateness. Manual review of that much data is not feasible.

The Amazon Rekognition Content Moderation API handles this task well, mass-scanning and classifying the uploaded images and videos to ensure they are appropriate and of high quality. It can also detect unwanted and toxic content in images that contain text. This saves time and increases productivity while reducing infrastructure cost.
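
A moderation check of this kind might look like the following sketch, which wraps Rekognition's DetectModerationLabels operation. This is an illustration, not CoStar's code: the helper names, the 60% confidence floor, and the bucket and key are assumptions. The client is passed in so the logic can be exercised without AWS access.

```python
def moderation_labels(client, bucket, key, min_confidence=60.0):
    """Return (name, confidence) pairs for moderation labels on an S3 image."""
    response = client.detect_moderation_labels(
        Image={"S3Object": {"Bucket": bucket, "Name": key}},
        MinConfidence=min_confidence,
    )
    return [(label["Name"], label["Confidence"])
            for label in response["ModerationLabels"]]


def is_appropriate(client, bucket, key, min_confidence=60.0):
    """Treat an image as appropriate when no moderation label is returned."""
    return not moderation_labels(client, bucket, key, min_confidence)


# Usage with real credentials (bucket and key are placeholders):
#   import boto3
#   client = boto3.client("rekognition")
#   is_appropriate(client, "example-uploads-bucket", "listing-123.jpg")
```

Images that fail the check could be routed to a human reviewer rather than rejected outright, which keeps the manual workload limited to the flagged fraction.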

 

10. Software Colombia: Amazon Rekognition Identity Verification

Software Colombia built a powerful identity verification system using Amazon Web Services. Identity management is a chief concern for digital businesses that conduct transactions online. Software Colombia needed an accurate and robust biometric facial recognition system to verify user identity by analyzing facial features and matching them against existing records. Its solution, eLogic Biometrics, was designed and prototyped with the AWS Envision Engineering team and can mitigate identity spoofing attacks.

The biometric face recognition and authentication mechanism has reduced the cost and risk of fraud on business-critical processes. It has enhanced the user authentication process and has established secure electronic communication.
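
Matching a face against an existing record can be sketched with Rekognition's CompareFaces operation. This is not the eLogic Biometrics implementation, and it does not cover liveness or anti-spoofing checks. The function name and the 90% similarity threshold are assumptions.

```python
def faces_match(client, record_photo, live_photo, threshold=90.0):
    """Return True if the live photo matches the photo on record.

    Both photos are raw image bytes. Rekognition only returns entries in
    FaceMatches whose similarity is at or above SimilarityThreshold.
    """
    response = client.compare_faces(
        SourceImage={"Bytes": record_photo},
        TargetImage={"Bytes": live_photo},
        SimilarityThreshold=threshold,
    )
    return len(response["FaceMatches"]) > 0


# Usage with real credentials:
#   import boto3
#   client = boto3.client("rekognition")
#   faces_match(client, open("id_card.jpg", "rb").read(),
#               open("selfie.jpg", "rb").read())
```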

 

11. American Airlines: IBM Cloud

American Airlines is working to improve the customer experience to stay ahead of its competitors. The airline struggled to meet customers’ appetite for instant information and services. It needed a new technology platform and a new approach to development to deliver digital self-service. IBM helped solve these problems while moving the airline’s applications to a cloud-based microservices architecture.

IBM Cloud has enabled the world’s largest airline to respond quickly and adapt to changing customer needs. The result is improved operational reliability, higher productivity, and faster customer response times. It also allows new apps to be developed and released quickly.

 

12. Mitsubishi Chemical: IBM Quantum

Mitsubishi Chemical is the largest chemical company based in Japan. Lithium is the lightest metal on the periodic table, and its properties are widely exploited to generate energy when it is combined with other elements. This blend of light weight and high energy potential makes it a star of 21st-century battery chemistry. Mitsubishi Chemical, like other members of the IBM Quantum Network, devotes budget to molecular simulations.

Quantum computing helps speed up the research. Mitsubishi Chemical, along with Keio University and IBM Quantum, is exploring the potential of lithium-oxygen as an energy source by employing new quantum computing algorithms.

This R&D work has already produced quantitatively correct computational results for complex chemical reactions in the discharge process of lithium-oxygen batteries. Beyond this, the Quantum Network encourages viewing molecular fundamentals through a new lens to uncover insights and phenomena that are little known or unexpected.

 

 

13. Dream11: Amazon Web Services

Dream11 is an Indian fantasy sports platform that lets users create a team of real players for an upcoming match and compete with other fantasy sports enthusiasts. The fantasy sports market had a compound annual growth rate of 32% last year. Sites with such a large digital customer base are favorite targets of cyber attackers.

AWS Identity and Access Management (IAM) controls employee access based on the principle of least privilege. AWS has also supported the platform’s user base expansion from 2 million to more than 100 million in the last 4 years, with 99.99% uptime and single-digit millisecond latency. AWS protects users’ data against external and internal intrusions and can detect fraud and collusion attempts. In addition, it provides personalized recommendations to users based on past activity, and the team launches small app enhancements every 2-3 days.
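
To make the principle of least privilege concrete, the sketch below builds an IAM policy document that grants a single action on a single resource and nothing else. The bucket name and the choice of s3:GetObject are hypothetical examples, not Dream11's actual configuration.

```python
import json


def read_only_bucket_policy(bucket_name):
    """Build an IAM policy that allows only reading objects from one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            }
        ],
    }


# "example-reports-bucket" is a placeholder name.
policy = read_only_bucket_policy("example-reports-bucket")
policy_json = json.dumps(policy, indent=2)
```

Because IAM denies anything not explicitly allowed, an employee with only this policy attached could read reports but could not delete them, list other buckets, or touch any other service.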

 

14. Cargills Bank: IBM QRadar

Cargills Bank Ltd. is a licensed commercial bank in Sri Lanka. Because financial institutions face a great threat from cybercrime, the bank wanted to enhance its cyber security. IBM QRadar Advisor with Watson safeguards customers with cognitive security. Analysts can easily examine a broad range of threat data and get actionable insights to make the right decision in minutes. With IBM QRadar SIEM with Watson, cyber threats can be detected and classified at an early stage.

This proactive approach has sped up the detection and accurate identification of cyber threats and alerts. It guards against sophisticated threat incidents through stronger preventive protocols. It has also helped transform millions of security documents into actionable intelligence relevant to specific types of threats.

 

 

15. Portland State University: IBM Cognos Analytics

Portland State University inspires students to remain ahead of the curve by providing an enriching education. It has more than 27,000 students enrolled in more than 200 degree programs. This large, urban university needs detailed analysis to ensure that every department is functioning smoothly. The main challenge was ensuring that university staff had access to the best analytical tools. It previously used tools provided by IBM Watson and chose to upgrade to IBM Cognos Analytics.
The tool unlocked new methods of interrogating data, such as geolocation analysis, and users can upload and analyze their own datasets as well. The upgrade path was smooth and low-risk. IBM Cognos Analytics has enhanced the functionality and usability of the university’s business intelligence platform.

 

FAQ

Multimodal AI leverages diverse data types like text, images, and video. This allows a better data representation by understanding the context.

Multimodal data enhances contextual understanding, thus resulting in accurate predictions and informed decisions. Simply put, it is akin to having multiple sense organs instead of one to interpret data and draw conclusions.

Earlier versions of ChatGPT are not multimodal. ChatGPT 4.0 is a multimodal AI.
