Large Language Models for Business: How to Make the Right Decision

Dr M Maruf Hossain, PhD, GAICD
Feb 22

Updated: Feb 26

Large Language Models (LLMs) represent a transformative technology in the realm of generative artificial intelligence (AI). These machine learning models are designed to understand and generate text that mirrors human language, learning from vast amounts of text through rigorous self- and semi-supervised training.

LLMs have a broad range of applications, from generating text and automating workflows to sparking creative ideas and even writing software code. Some of the most prominent LLMs include OpenAI’s GPT series (such as GPT-3.5 and GPT-4), Google’s Gemini, Bard and PaLM, Meta’s LLaMA, BigScience’s BLOOM, Baidu’s Ernie 3.0 Titan, and Anthropic’s Claude.

Given the potential of LLMs to revolutionise business operations, many organisations are eager to integrate these models into their workflows. A common question is whether it is necessary to develop task-specific custom LLMs to improve performance. Integrating LLMs into business workflows requires careful planning and evaluation. This article is particularly relevant for organisations outside the technology sector that are contemplating integrating LLMs into their AI ecosystems. Before addressing the fundamental question of whether to create a custom LLM, it is crucial to understand the prerequisites for developing a custom LLM.

Originally published at LinkedIn Pulse on 13 January 2024.

Considerations before creating a custom LLM

Before embarking on the creation of a custom Large Language Model (LLM), several critical factors should be considered:

Data volume. LLMs are trained on extensive text datasets, often ranging from hundreds of gigabytes to terabytes in size. For instance, OpenAI’s GPT-3 model was trained on hundreds of billions of tokens gathered from various internet sources. Similarly, a custom LLM for the healthcare sector would require a large volume of data from sources such as medical journals, patient records, clinical trial data, and health websites.
Data quality. The performance of LLMs is directly influenced by the quality and quantity of the training data. Training LLMs with subpar datasets can lead to issues such as bias and overfitting. For example, a custom LLM for the legal sector would require high-quality data that is accurate, relevant, up to date, and free of errors, inconsistencies, and duplication.
Data diversity. The training data should be collected from a variety of sources, including books, web pages, scientific papers, and online forums. This diversity enables the model to learn nuanced language patterns and semantics. For example, a custom LLM for the entertainment sector would require diverse data that covers different genres, styles, formats, and audiences, as well as cultural and historical references.
Data pre-processing. The creation of a custom LLM requires robust, flexible data pipelines that can perform tasks such as cleaning and normalisation, tokenisation and vectorisation, handling missing data, and data augmentation. For instance, a custom LLM for the education sector would require data pre-processing to ensure the data is suitable for the intended learning outcomes, including readability, complexity, and alignment with the curriculum.
Data security. It’s crucial to secure datasets containing sensitive information to protect user privacy and comply with industry regulations. For example, a custom LLM for the finance sector would require data security measures to safeguard data from unauthorised access, modification, or disclosure, and to adhere to relevant standards and policies.

Remember, curating and annotating a diverse training dataset that accurately represents the model’s domain is a critical aspect of implementing AI solutions.
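
The cleaning, normalisation, missing-data handling, and tokenisation steps mentioned above can be illustrated with a minimal Python sketch. It is an illustration only, not a production pipeline: the function names and the minimum-length threshold are assumptions made for this example, and real LLM pipelines typically use subword tokenisers and near-duplicate detection rather than the naive versions shown here.

```python
import re
import unicodedata
from typing import List, Optional, Set


def normalise(text: str) -> str:
    """Normalise Unicode, collapse whitespace, and lower-case the text."""
    text = unicodedata.normalize("NFKC", text)
    text = re.sub(r"\s+", " ", text).strip()
    return text.lower()


def clean_corpus(documents: List[Optional[str]], min_words: int = 5) -> List[str]:
    """Drop missing, very short, and exact-duplicate documents.

    min_words is an illustrative threshold, not a recommended value.
    """
    seen: Set[str] = set()
    cleaned: List[str] = []
    for doc in documents:
        if not doc:  # missing data: None or empty string
            continue
        norm = normalise(doc)
        if len(norm.split()) < min_words:  # too short to be useful
            continue
        if norm in seen:  # exact duplicate after normalisation
            continue
        seen.add(norm)
        cleaned.append(norm)
    return cleaned


def tokenise(text: str) -> List[str]:
    """Naive word-level tokeniser for already-normalised text."""
    return re.findall(r"[a-z0-9]+", text)


if __name__ == "__main__":
    raw = [
        "The  patient was\tdischarged after two days.",
        "the patient was discharged after two days.",
        "",
        None,
        "Too short.",
    ]
    for doc in clean_corpus(raw):
        print(doc, tokenise(doc))
```

Even this toy version shows why pipelines matter: five raw records collapse to a single usable document once duplicates, gaps, and fragments are removed.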

In addition to data considerations, the time required to train a model is another significant factor:

Model size. Larger models with more parameters take longer to train. For example, training GPT-3, which has 175 billion parameters, on a single NVIDIA Tesla V100 GPU would take an estimated 288 years. A custom LLM for the travel sector, for instance, would require a large model size to capture the complexity and variety of travel-related language, such as destinations, attractions, reviews, and bookings.
Computational resources. Training time can be significantly reduced by using more powerful hardware or by distributing the training process across multiple GPUs. For example, a custom LLM for the gaming sector would require substantial computational resources to train the model efficiently and effectively, as well as to support the high-performance demands of the gaming environment.
Training complexity. The complexity of the training process, including the model’s architecture and the optimisation algorithms used, can also impact the training time. For example, a custom LLM for the art sector would require a complex training process that incorporates elements such as creativity, originality, and aesthetics, as well as technical aspects such as style, colour, and composition.
Practical considerations. In practice, training a state-of-the-art LLM can take several months, even with substantial computational resources. For instance, a team building a custom LLM for the social media sector would need to weigh trade-offs between speed and quality, data availability and accessibility, and model scalability and maintainability.

Remember, while training time is important, it’s also crucial to consider the quality of the trained model. Faster training doesn’t necessarily produce a better model, nor does slower training yield a superior one. The ultimate goal should be to balance training time with model performance.
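
To make the relationship between model size, hardware, and training time concrete, the following Python sketch scales the single-GPU estimate quoted above (288 years on one V100) across a cluster. It is back-of-the-envelope arithmetic only: the scaling_efficiency parameter and the cluster size of 1,024 GPUs are assumptions for illustration, and real distributed training rarely scales perfectly.

```python
def training_time_years(single_gpu_years: float,
                        num_gpus: int,
                        scaling_efficiency: float = 1.0) -> float:
    """Estimate wall-clock training time when work is spread over several GPUs.

    scaling_efficiency is a hypothetical factor in (0, 1]; 1.0 means perfect
    linear scaling, which real clusters do not achieve.
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    if not 0 < scaling_efficiency <= 1:
        raise ValueError("scaling_efficiency must be in (0, 1]")
    return single_gpu_years / (num_gpus * scaling_efficiency)


if __name__ == "__main__":
    single_gpu_years = 288  # estimate quoted in the text for GPT-3 on one V100
    for efficiency in (1.0, 0.5):
        years = training_time_years(single_gpu_years, 1024, efficiency)
        print(f"1024 GPUs at {efficiency:.0%} efficiency: "
              f"{years:.2f} years (~{years * 365:.0f} days)")
```

Under these assumptions, 1,024 GPUs bring the estimate down to between roughly three and seven months, which is consistent with the "several months" figure mentioned above.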

For instance, a financial institution might consider creating a custom LLM to automate customer service interactions. However, they must weigh the benefits of a custom model (such as potentially better performance and greater control over the training data) against the costs (including data collection and annotation, computational resources, and training time). They might find that fine-tuning an existing LLM with their customer interaction data is a more cost-effective approach that still delivers high-quality results.

Advantages and disadvantages of custom LLM and vendor models

There are two main options to integrate LLMs into business workflows: using a vendor’s LLM or creating a custom LLM. Each option has its pros and cons, which are summarised below:

Some advantages of using vendor LLMs include:

Scalability. An organisation can leverage the vendor’s cloud-based services to train and deploy LLMs without worrying about computing resources or data storage. For instance, Google Cloud can provide scalable, reliable infrastructure for LLMs, including Cloud TPUs and Cloud Storage.
Cost efficiency. If one doesn’t have access to high-end hardware, using the cloud can be a more economical solution. For example, Amazon Web Services offers pay-as-you-go pricing for services that can support LLM workloads, such as AWS Lambda and Amazon S3.
Ease of use. Vendors provide pre-trained models that are ready to use or to fine-tune, saving time and resources. For example, OpenAI provides access to pre-trained models such as GPT-3.5 and GPT-4, which can be used or fine-tuned for various tasks.
Managed services. Vendors handle the setup, maintenance, security, and optimisation of the infrastructure, reducing the operational overhead. For instance, Microsoft Azure offers managed services for LLMs, including Azure Machine Learning and Azure Cognitive Services.
Continual updates. Vendors typically provide regular updates to their models, ensuring one benefits from the latest advancements. For example, Meta has released successive versions of its models, such as LLaMA and LLaMA 2, which incorporate the latest research and innovations.
Support. Vendors often provide support and resources to help one get the most out of their models. For instance, IBM can provide support and resources for LLMs, such as IBM Watson and IBM Cloud Pak for Data.

Whereas the disadvantages include:

Lack of control. The organisation has less control over a vendor’s model, including how it’s trained and on what data. A vendor’s model might not align with their business goals, values, or ethics, or reflect their domain-specific knowledge or terminology.
Potential for vendor lock-in. Switching vendors can be difficult and costly, especially if one relies on their proprietary models or services. For example, a vendor might change its pricing, policies, or features, or might discontinue its models or services, which can affect the organisation’s business continuity or performance.
Cost. There may be ongoing costs associated with licensing a vendor’s model, which can vary based on usage, features, or quality. For example, a vendor might charge based on the number of requests, the amount of data, the level of accuracy, or the complexity of the task.

Some advantages of using custom LLMs include:

Customisation. The organisation can tailor the model to its specific needs, leading to better performance on specific tasks. For instance, one can train the model on their data to capture their domain-specific knowledge, terminology, and preferences.
Transparency and flexibility. Open-source LLMs provide transparency and flexibility, allowing full control over the data and the model. For example, one can work with open-source models through libraries such as Hugging Face Transformers or TensorFlow Text, which allow one to modify, extend, or improve the model as desired.
Cost savings. While the initial investment can be high, owning the model can be less expensive in the long run because there are no ongoing licensing fees. One can avoid paying for the vendor’s model or services and pay only for infrastructure costs, which can be reduced by using efficient hardware or software.
Added features and community contributions. One can add features to the LLM to benefit their specific use case and leverage community contributions. They can add features such as sentiment analysis, summarisation, or translation to the LLM, and use community-contributed models or datasets from platforms like GitHub or Kaggle.
Data security. One can ensure the security of their data, which is particularly important if the model is trained on sensitive or proprietary information. For example, they can encrypt, anonymise, or obfuscate their data, and use secure protocols and platforms to store and access the data.

Some disadvantages include:

Resource-intensive. Training LLMs requires significant computational resources and expertise, which can be challenging to acquire and maintain. For instance, one might need to invest in high-end hardware, such as GPUs or TPUs, or hire skilled professionals, such as data scientists or machine learning engineers, to train LLMs.
Maintenance. The organisation is responsible for maintaining and updating the model, which can be time-consuming and complex. For example, they might need to monitor, debug, or retrain the model, or keep up with the latest research and developments in LLMs.
Time-consuming. Training and fine-tuning an LLM can be time-consuming, depending on the size and complexity of both the model and the data. For example, it might take several weeks or months to train a state-of-the-art LLM like GPT-4, whose parameter count has not been publicly disclosed.

The choice between creating a custom LLM and using a vendor model depends on the organisation’s specific needs, resources, and expertise. It’s important to carefully consider these factors before making a decision.
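
One way to weigh the cost trade-off between the two options is a simple break-even calculation, sketched below in Python. All figures are hypothetical placeholders chosen for illustration, not real prices, and the model deliberately ignores factors such as staff costs, retraining, and usage growth.

```python
import math
from typing import Optional


def breakeven_months(custom_upfront: float,
                     custom_monthly: float,
                     vendor_monthly: float) -> Optional[int]:
    """Return the first whole month at which the cumulative cost of a custom
    model is no higher than the cumulative cost of a vendor model.

    Returns None when the custom model never catches up, i.e. when its
    monthly running cost is not lower than the vendor's monthly fee.
    """
    monthly_saving = vendor_monthly - custom_monthly
    if monthly_saving <= 0:
        return None
    return math.ceil(custom_upfront / monthly_saving)


if __name__ == "__main__":
    # Hypothetical figures in an arbitrary currency.
    months = breakeven_months(custom_upfront=500_000,
                              custom_monthly=10_000,
                              vendor_monthly=30_000)
    print(f"Break-even after {months} months")
```

If the break-even point lies beyond the period over which the organisation expects to use the model unchanged, the vendor option is likely to be the cheaper one under these assumptions.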

Challenges in creating a custom LLM

Organisations may encounter several challenges when creating their custom LLMs:

Lack of expertise. Developing, training, and maintaining LLMs requires specialised skills in machine learning, natural language processing, and data science, which organisations may lack. As a result, they may need to hire external consultants or train existing staff to acquire the necessary expertise.
Resource-intensive. Training LLMs requires significant computational resources that companies may not have. Additionally, maintaining and updating these models requires ongoing investment. Companies may therefore need to purchase or rent high-end hardware, such as GPUs or TPUs, or use cloud-based services, all of which can incur high costs.
Data privacy and security. Handling sensitive data for training LLMs can pose data privacy and security risks that organisations may not be prepared for. For example, a company may need to implement data protection measures, such as encryption, anonymisation, or obfuscation, and comply with data regulations, such as GDPR or CCPA.
Time-consuming. Training and fine-tuning an LLM can be time-consuming, which many companies may not be able to afford. They may need to allocate significant time and resources to the effort, which could divert them from their core business operations.
Language limitations. Developing AI systems in languages other than English has been difficult because far fewer resources are available for them. This could be a barrier for companies operating in non-English-speaking regions. For example, a multinational organisation may need to source or create data in other languages or use multilingual models, which can be challenging or costly.
Generalisation issues. General-purpose models are trained on vast, diverse datasets, enabling them to handle a wide array of tasks reasonably well. However, they may not perform as well on specific, complex enterprise operations.

Therefore, for many organisations, it may be more practical and cost-effective to use vendor-provided LLMs, which are ready to use, regularly updated, and supported.

LLMs for information retrieval

For many organisations, the first use cases for LLMs involve retrieving organisation-specific content through a natural language-based interface or a chatbot. LLMs can indeed be used as an information source, but there are some important factors to keep in mind:

Accuracy. LLMs are trained on vast amounts of data, but they cannot verify the accuracy or timeliness of the information they generate. They can sometimes produce incorrect or outdated information about a product, service, or policy, which could mislead or confuse customers or employees.
Context. LLMs generate text based on patterns they’ve learned from their training data. They do not understand context in the same way humans do. This means they might not fully grasp the nuances of certain topics or questions. Therefore, an LLM might generate irrelevant or inappropriate information in response to a specific query, which could frustrate or offend users or stakeholders.
Bias. LLMs can unintentionally propagate biases present in their training data. This can lead to biased or unfair information that reflects stereotypes, prejudices, or discrimination, harming the organisation’s reputation or values.
Lack of common sense. Despite their impressive capabilities, LLMs often lack common-sense reasoning. They might produce outputs that, while grammatically correct, are nonsensical or illogical and contradict facts, common knowledge, or common sense, undermining the organisation’s credibility or trustworthiness.
Data privacy and security. If an LLM is trained on sensitive or proprietary data, using it as an information source could potentially expose this data. This could violate data privacy and security regulations or policies, or cause legal or ethical issues.
Speed of change. Organisational data, policies, and products can change faster than an LLM can be created or fine-tuned. Hence, it is always a good idea to keep the source of truth separate from the LLM. Otherwise, an LLM might generate outdated or inconsistent information that does not reflect the current state of the organisation, which could cause confusion or errors.

While LLMs can be a valuable tool for generating text and providing information, they should not be the sole source. It’s important to cross-verify the information against other reliable sources, such as databases, documents, or experts.

LLMs can be effectively used for enterprise information retrieval in several ways:

Leveraging LLM APIs. The first way to use LLMs in an enterprise context is to make an API call to a model provided as a service. This approach has several advantages, including a low barrier to entry, access to more sophisticated models, and fast responses. However, it may also be inappropriate for certain enterprise applications due to data residency and privacy concerns, potentially higher costs, and dependency on the service provider. For example, an organisation might use an LLM API to generate text for marketing campaigns, customer service, or internal communications, but they might also need to consider the data sovereignty, security, and cost implications of using a third-party service. Moreover, this might not be enough to overcome the challenges of using vendor models.
Running an open-source model in a managed environment. The second option is to download and run an open-source model in an environment managed by the organisation. This gives the organisation full control over the data and the model, ensuring data privacy and security. For instance, an organisation might run an open-source model in its own cloud or on-premises infrastructure, allowing it to customise, modify, or improve the model as needed and to secure its data against unauthorised access or disclosure. Some of the challenges that apply to vendor models might still apply to this approach.
Retrieval-Augmented Generation (RAG). Leverage external knowledge sources to enhance responses and retrieve specific information from organisational databases. For example, the model can access relevant documents in a database and use this information to formulate responses. This approach is widely favoured for extracting precise information from the organisational knowledge base and articulating it in natural language. Notably, Bing.ai uses a similar strategy: it first interprets the user’s query, then searches online for relevant content, and finally uses only the identified pages to construct a comprehensive and accurate response.
Connect LLMs to external data. Cross-reference responses with trusted external databases to enhance answer verification. For example, an organisation might connect an LLM to external data sources, such as Wikipedia, news articles, or databases, which can enrich the information generated by the LLM and improve its quality and relevance.
Pairing LLMs with high-performance databases. LLMs can be paired with highly scalable, high-performance databases on the back end that take queries and analytics code generated by LLMs, scan millions or billions of records, and translate the data into insights. For instance, an organisation might pair an LLM with a high-performance database, such as Snowflake, which can handle large-scale data analysis and provide fast, accurate insights for business intelligence, decision-making, or reporting.
Fine-tuning for specific tasks. LLMs can be fine-tuned on specific tasks or domains, such as legal, medical, or financial, to improve their performance. This requires task-specific data and more computational resources than using a pre-trained model as-is. However, many such tasks are common, and tools for them can be procured from vendors, such as GitHub Copilot to assist with writing code or Pega GenAI to create workflows faster.

Remember, the best way to use LLMs for enterprise information retrieval depends on the organisation’s specific needs and resources. It’s important to carefully consider these factors before making a decision.
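
The retrieval-augmented approach described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions: retrieval uses simple word overlap rather than the embedding-based search common in practice, the llm argument is a placeholder for whichever model API the organisation uses, and all function names are invented for this example.

```python
import re
from typing import Callable, List, Set


def _words(text: str) -> Set[str]:
    """Lower-case the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, documents: List[str], top_k: int = 2) -> List[str]:
    """Rank documents by the number of words shared with the query.

    A stand-in for a real retriever such as a vector or keyword search index.
    """
    query_words = _words(query)
    scored = [(len(query_words & _words(doc)), doc) for doc in documents]
    scored = [(score, doc) for score, doc in scored if score > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def build_prompt(query: str, passages: List[str]) -> str:
    """Combine the retrieved passages and the question into one prompt."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below. "
        "If the answer is not in the context, say you do not know.\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


def answer(query: str, documents: List[str], llm: Callable[[str], str]) -> str:
    """Retrieve supporting passages, then ask the model to answer from them."""
    passages = retrieve(query, documents)
    if not passages:
        return "No relevant document found."
    return llm(build_prompt(query, passages))


if __name__ == "__main__":
    knowledge_base = [
        "Refunds are processed within 14 days of a return.",
        "Our office is open Monday to Friday.",
    ]
    # The lambda simply echoes the prompt; a real deployment would call an LLM.
    print(answer("How long do refunds take?", knowledge_base, llm=lambda p: p))
```

The key design point is that the organisation's documents stay outside the model: updating the knowledge base changes the answers without any retraining.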

Concluding remark

Creating a custom LLM demands a substantial investment in high-quality, domain-specific training data, along with significant computational resources and expertise in machine learning and natural language processing. The decision to pursue a custom LLM should be made with care, considering the organisation’s unique requirements, available resources, and potential return on investment. Amidst the exciting era of AI in business, LLMs present a significant opportunity for innovation and efficiency enhancement.

Alternatively, leveraging pre-trained LLMs and fine-tuning them for specific tasks can prove more efficient and cost-effective. For instance, healthcare providers and law firms can automate tasks such as patient communication and legal document drafting, respectively. Yet it’s crucial to augment this approach with validation methods such as RAG or checks against external sources to ensure accurate responses. Another strategy involves using LLMs solely to understand user queries and formulate responses, while obtaining information from the organisation’s trusted source of truth, thereby eliminating the need for model fine-tuning.
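
The last strategy, in which the model only interprets the question and words the reply while the facts come from a trusted system, can be sketched as follows in Python. The example is hypothetical throughout: the products table, its contents, and the function names are invented for illustration, and extract_product is a plain keyword match standing in for the step an LLM would perform.

```python
import sqlite3
from typing import List, Optional


def build_catalogue() -> sqlite3.Connection:
    """Create an in-memory database acting as the source of truth."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (name TEXT PRIMARY KEY, price REAL)")
    conn.executemany(
        "INSERT INTO products VALUES (?, ?)",
        [("basic plan", 10.0), ("premium plan", 25.0)],  # made-up data
    )
    return conn


def extract_product(question: str, known_products: List[str]) -> Optional[str]:
    """Stand-in for the LLM step: identify which product the user means."""
    lowered = question.lower()
    for name in known_products:
        if name in lowered:
            return name
    return None


def answer_price(question: str, conn: sqlite3.Connection) -> str:
    """Answer a price question using only values read from the database."""
    names = [row[0] for row in conn.execute("SELECT name FROM products")]
    product = extract_product(question, names)
    if product is None:
        return "Sorry, I could not identify the product."
    row = conn.execute(
        "SELECT price FROM products WHERE name = ?", (product,)
    ).fetchone()
    return f"The current price of the {product} is {row[0]:.2f}."


if __name__ == "__main__":
    catalogue = build_catalogue()
    print(answer_price("How much is the Premium Plan?", catalogue))
```

Because the price is read from the database at query time, a price change takes effect immediately, with no need to retrain or fine-tune the model.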