Archive for the ‘Machine Learning’ Category
Code generation using Code Llama 70B and Mixtral 8x7B on Amazon SageMaker | Amazon Web Services – AWS Blog
Posted: June 11, 2024 at 2:48 am
In the ever-evolving landscape of machine learning and artificial intelligence (AI), large language models (LLMs) have emerged as powerful tools for a wide range of natural language processing (NLP) tasks, including code generation. Among these cutting-edge models, Code Llama 70B stands out as a true heavyweight, boasting an impressive 70 billion parameters. Developed by Meta and now available on Amazon SageMaker, this state-of-the-art LLM promises to revolutionize the way developers and data scientists approach coding tasks.
Code Llama 70B is a variant of the Code Llama foundation model (FM), a fine-tuned version of Meta's Llama 2 model. This large language model is specifically designed for code generation and understanding, and can generate code from natural language prompts or existing code snippets. With its 70 billion parameters, Code Llama 70B is the largest model in the Code Llama family.
Mixtral 8x7B is a state-of-the-art sparse mixture of experts (MoE) foundation model released by Mistral AI. It supports multiple use cases such as text summarization, classification, text generation, and code generation. It is an 8x model, which means it contains eight distinct groups of parameters. The model has about 45 billion total parameters and supports a context length of 32,000 tokens. MoE is a type of neural network architecture that consists of multiple experts where each expert is a neural network. In the context of transformer models, MoE replaces some feed-forward layers with sparse MoE layers. These layers have a certain number of experts, and a router network selects which experts process each token at each layer. MoE models enable more compute-efficient and faster inference compared to dense models.
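The routing step described above can be illustrated with a small, dependency-free sketch. It is not Mixtral's implementation: the "experts" are stand-in scaling functions rather than feed-forward networks, the router weights are random, and the choice of two active experts per token (`top_k=2`) is an assumption made for illustration.

```python
import math
import random


def softmax(scores):
    """Convert raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(scale):
    """Stand-in expert: a real expert would be a feed-forward network."""
    return lambda token: [scale * value for value in token]


def moe_layer(token, router_weights, experts, top_k=2):
    """Route one token through the top_k experts chosen by a linear router."""
    # The router produces one score per expert (a dot product with the token).
    scores = [sum(w * v for w, v in zip(row, token)) for row in router_weights]
    # Only the top_k highest-scoring experts run; the others are skipped.
    chosen = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:top_k]
    gates = softmax([scores[i] for i in chosen])
    # The layer output is the gate-weighted sum of the chosen experts' outputs.
    output = [0.0] * len(token)
    for gate, index in zip(gates, chosen):
        expert_output = experts[index](token)
        output = [o + gate * e for o, e in zip(output, expert_output)]
    return output, chosen


random.seed(0)
NUM_EXPERTS, DIM = 8, 4  # eight experts, toy embedding size
experts = [make_expert(scale=i + 1) for i in range(NUM_EXPERTS)]
router_weights = [[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]
token = [0.5, -1.0, 0.25, 2.0]

output, chosen = moe_layer(token, router_weights, experts)
print("experts used for this token:", chosen)
```

Because only `top_k` of the eight experts run for a given token, the compute per token is closer to that of a much smaller dense model, which is the efficiency argument made above.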
Key features and capabilities of Code Llama 70B and Mixtral 8x7B include:
Amazon SageMaker, a fully managed machine learning service, integrates with Code Llama 70B so that developers and data scientists can use its capabilities with just a few clicks. Here's how you can get started:
The following figure showcases how code generation can be done using the Llama and Mistral AI Models on SageMaker presented in this blog post.
You first deploy a SageMaker endpoint using an LLM from SageMaker JumpStart. For the examples presented in this article, you deploy either a Code Llama 70B or a Mixtral 8x7B endpoint. After the endpoint has been deployed, you can use it to generate code with the prompts provided in this article and the associated notebook, or with your own prompts. After the code has been generated with the endpoint, you can use a notebook to test the code and its functionality.
In this section, you sign up for an AWS account and create an AWS Identity and Access Management (IAM) admin user.
If you're new to SageMaker, we recommend that you read What is Amazon SageMaker?.
Use the following hyperlinks to finish setting up the prerequisites for an AWS account and SageMaker:
With the prerequisites complete, you're ready to continue.
The Mixtral 8x7B and Code Llama 70B models require an ml.g5.48xlarge instance. SageMaker JumpStart provides a simplified way to access and deploy over 100 different open source and third-party foundation models. To deploy an endpoint using SageMaker JumpStart, you might need to request a service quota increase to access an ml.g5.48xlarge instance for endpoint use. You can request service quota increases through the AWS console, AWS Command Line Interface (AWS CLI), or API to allow access to those additional resources.
While Code Llama excels at generating simple functions and scripts, its capabilities extend beyond that. The models can generate complex code for advanced applications, such as building neural networks for machine learning tasks. Let's explore an example of using Code Llama to create a neural network on SageMaker, starting with deploying the Code Llama model through SageMaker JumpStart.
Additional details on deployment can be found in Code Llama 70B is now available in Amazon SageMaker JumpStart.
Note: This blog post section contains code that was generated with the assistance of Code Llama 70B, powered by Amazon SageMaker.
Let us walk through a code generation example with Code Llama 70B in which you generate a transformer model in Python using the Amazon SageMaker SDK.
Prompt:
Response:
Code Llama generates a Python script for training a Transformer model on the sample dataset using TensorFlow and Amazon SageMaker.
Code example: Create a new Python script (for example, code_llama_inference.py) and add the following code. Replace
Save the script and run it:
python code_llama_inference.py
The script will send the provided prompt to the Code Llama 70B model deployed on SageMaker, and the model's response will be printed to the output.
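The script itself is not reproduced in this copy of the post. As a minimal sketch of what such a script might look like, the following uses the boto3 `sagemaker-runtime` client to call a deployed endpoint. The endpoint name, the prompt, and the generation parameters are hypothetical placeholders, and the `inputs`/`parameters` payload layout is an assumption based on the common JumpStart text-generation format; check the model's documentation for the exact schema and any additional arguments your endpoint requires.

```python
import json

# Hypothetical endpoint name: replace with the name of your deployed endpoint.
ENDPOINT_NAME = "my-code-llama-70b-endpoint"


def build_payload(prompt, max_new_tokens=256, temperature=0.2, top_p=0.9):
    """Build the request body (assumed JumpStart text-generation layout)."""
    return {
        "inputs": prompt,
        "parameters": {
            "max_new_tokens": max_new_tokens,
            "temperature": temperature,
            "top_p": top_p,
        },
    }


def generate_code(prompt, endpoint_name=ENDPOINT_NAME):
    """Send the prompt to the SageMaker endpoint and return the parsed response."""
    # Imported here so the payload helper can be used without AWS access.
    import boto3

    client = boto3.client("sagemaker-runtime")
    response = client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=json.dumps(build_payload(prompt)),
    )
    return json.loads(response["Body"].read())
```

With AWS credentials configured and an endpoint in service, calling `print(generate_code("Write a Python function that reverses a string."))` would print the model's raw JSON response.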
Example output:
Input
> Output
You can modify the prompt variable to request different code generation tasks or engage in natural language interactions with the model.
This example demonstrates how to deploy and interact with the Code Llama 70B model on SageMaker JumpStart using Python and the AWS SDK. Because the model can make minor errors in its response, make sure you run the generated code to verify it. You can also instruct the model to check its own output and refine the response to fix any remaining errors in the code. With this setup, you can use the code generation capabilities of Code Llama 70B within your development workflows to streamline the coding process. Let's take a look at some additional examples.
Let's walk through some other complex code generation scenarios. In the following sample, we're running the script to generate a Deep Q reinforcement learning (RL) agent for playing the CartPole-v0 environment.
The following prompt was tested on Code Llama 70B to generate a Deep Q RL agent adept at playing the CartPole-v0 environment.
Prompt:
Response: Code Llama generates a Python script for training a DQN agent on the CartPole-v1 environment using TensorFlow and Amazon SageMaker as showcased in our GitHub repository.
In this scenario, you will generate sample Python code for distributed machine learning training on Amazon SageMaker using Code Llama 70B.
Prompt:
Response: Code Llama generates a Python script for distributed training of a deep neural network on the ImageNet dataset using PyTorch and Amazon SageMaker. Additional details are available in our GitHub repository.
Compared to traditional LLMs, Mixtral 8x7B offers the advantage of faster decoding at the speed of a smaller, parameter-dense model despite containing more parameters. It also outperforms other open-access models on certain benchmarks and supports a longer context length.
Additional details on deployment can be found in Mixtral-8x7B is now available in Amazon SageMaker JumpStart.
Hyperparameters are external configuration variables that data scientists use to manage machine learning model training. Sometimes called model hyperparameters, they are manually set before training a model. They're different from parameters, which are internal values automatically derived during the learning process and not set by data scientists. Hyperparameters directly control model structure, function, and performance.
When you build complex machine learning systems like deep learning neural networks, exploring all the possible combinations is impractical. Hyperparameter tuning can accelerate your productivity by trying many variations of a model. It looks for the best model automatically by focusing on the most promising combinations of hyperparameter values within the ranges that you specify. To get good results, you must choose the right ranges to explore.
SageMaker automatic model tuning (AMT) finds the best version of a model by running many training jobs on your dataset. To do this, AMT uses the algorithm and ranges of hyperparameters that you specify. It then chooses the hyperparameter values that create a model that performs the best, as measured by a metric that you choose.
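The idea of searching within specified ranges and keeping the best-scoring configuration can be sketched in plain Python. This is an illustration of random search only, not the AMT API and not necessarily the strategy AMT applies. The `validation_score` function is a hypothetical stand-in for a training job that returns the chosen metric, and the hyperparameter names and ranges are made up for the example.

```python
import random


def validation_score(learning_rate, num_layers):
    """Hypothetical stand-in for a training job; higher is better."""
    return -((learning_rate - 0.01) ** 2) * 1e4 - (num_layers - 4) ** 2


def random_search(ranges, trials=50, seed=0):
    """Sample hyperparameters from the given ranges and keep the best trial."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(trials):
        # Draw one candidate configuration from the user-specified ranges.
        params = {
            "learning_rate": rng.uniform(*ranges["learning_rate"]),
            "num_layers": rng.randint(*ranges["num_layers"]),
        }
        score = validation_score(**params)
        # Keep the configuration with the best metric seen so far.
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


ranges = {"learning_rate": (0.001, 0.1), "num_layers": (1, 8)}
best_params, best_score = random_search(ranges)
print("best hyperparameters:", best_params)
print("best score:", best_score)
```

The quality of the result depends on the ranges: if the best values lie outside them, no number of trials will find them, which is why choosing the right ranges matters.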
Note: This blog post section contains code that was generated with the assistance of the Mixtral 8x7B model, powered by Amazon SageMaker.
Prompt:
Response:
There are instances where users need to convert code written in one programming language to another. This is known as a cross-language transformation task, and foundation models can help automate the process.
Prompt:
Response:
This Python code uses a built-in list data structure instead of the Java ArrayList class. The code above is more idiomatic and efficient in Python.
The AWS Cloud Development Kit (AWS CDK) is an open-source software development framework for defining cloud infrastructure as code with modern programming languages and deploying it through AWS CloudFormation.
The three-tier architecture pattern provides a general framework to ensure decoupled and independently scalable application components can be separately developed, managed, and maintained (often by distinct teams). A three-tier architecture is the most popular implementation of a multi-tier architecture and consists of a single presentation tier, logic tier, and data tier:
Prompt:
Response:
The following are some additional considerations when implementing these models:
Delete the model endpoints deployed using Amazon SageMaker for Code Llama and Mistral to avoid incurring any additional costs in your account.
Shut down any SageMaker Notebook instances that were created for deploying or running the examples showcased in this blog post to avoid any notebook instance costs associated with the account.
The combination of foundation models like Code Llama 70B and Mixtral 8x7B with the machine learning platform of SageMaker gives developers and data scientists an opportunity to improve their coding workflows. These FMs help customers generate high-quality code, infill missing sections, and engage in natural language interactions, all while using the scalability, security, and compliance of AWS.
The examples highlighted in this blog post demonstrate these models' capabilities in generating complex code for various machine learning tasks, such as natural language processing, reinforcement learning, distributed training, and hyperparameter tuning, all tailored for deployment on SageMaker. Developers and data scientists can use them to streamline their workflows and accelerate development cycles in the AWS Cloud.
To get started with AI-assisted coding, try Code Llama 70B and Mixtral 8x7B on Amazon SageMaker.
Shikhar Kwatra is an AI/ML Solutions Architect at Amazon Web Services based in California. He has earned the title of one of the Youngest Indian Master Inventors with over 500 patents in the AI/ML and IoT domains. Shikhar aids in architecting, building, and maintaining cost-efficient, scalable cloud environments for the organization, and supports the GSI partners in building strategic industry solutions on AWS. Shikhar enjoys playing guitar, composing music, and practicing mindfulness in his spare time.
Jose Navarro is an AI/ML Solutions Architect at AWS based in Spain. Jose helps AWS customers, from small startups to large enterprises, architect and take their end-to-end machine learning use cases to production. In his spare time, he loves to exercise, spend quality time with friends and family, and catch up on AI news and papers.
Farooq Sabir is a Senior Artificial Intelligence and Machine Learning Specialist Solutions Architect at AWS. He holds PhD and MS degrees in Electrical Engineering from the University of Texas at Austin and an MS in Computer Science from Georgia Institute of Technology. He has over 15 years of work experience and also likes to teach and mentor college students. At AWS, he helps customers formulate and solve their business problems in data science, machine learning, computer vision, artificial intelligence, numerical optimization, and related domains. Based in Dallas, Texas, he and his family love to travel and go on long road trips.
iPadOS 18’s Smart Script uses machine learning to make your handwriting less horrible – Yahoo Movies Canada
Posted: at 2:48 am
Last month, Apple's tablets got a major revamp with the arrival of the M4 chip, two size options for the iPad Air, updates to the Magic Keyboard and a new iPad Pro packing a fancy Tandem OLED display. And now at WWDC 2024, Apple is looking to flesh out the iPad's software with the introduction of Apple Intelligence and a number of fresh features heading to iPadOS 18, which is due out sometime later this year.
To start, iPadOS is getting deeper customization options for your home screen including the ability to put app icons pretty much wherever you want. Apple's Control Center has also been expanded with support for creating multiple lists and views, resizing and rearranging icons and more. There's also a new floating tab bar that makes it easy to navigate between apps, which can be further tuned to remember your favorites. Next, SharePlay is getting the ability to draw diagrams on someone else's iPad or control someone else's device remotely (with permission) for times like when you need to help troubleshoot.
After years of requests, the iPad is also getting its own version of the Calculator app, which includes a new Math Notes feature that supports the Apple Pencil and the ability to input handwritten formulas. Math Notes will even update formulas in real time or you can save them in case you want to revisit things later. Alternatively, the Smart Script tool in the Notes app uses machine learning to make your notes less messy and easier to edit.
General privacy is also being upgraded with a new feature that lets you lock an app. This allows a friend or family member to borrow your device without giving them full access to everything on your tablet. Alternatively, there's also a new hidden apps folder so you can stash sensitive software in a more secretive way.
In Messages, Tapbacks are now compatible with all your emoji. Furthermore, you'll be able to schedule messages or send texts via satellite in case you aren't currently connected to Wi-Fi or a cellular network. Apple even says messages sent using satellite will feature end-to-end encryption.
The Mail and Photos apps are also getting similarly big revamps. Mail will feature new categorizations meant to make it easier to find specific types of offers or info (like plane flights). Meanwhile, the Photos app will sport an updated UI that will help you view specific types of images while hiding things like screenshots. And to better surface older photos and memories, there will be new categories like Recent Days and People and Pets to put similar types of pics all in a single collection.
Audio controls on iPads are also getting a boost with a new ability for Siri to understand gestures for Yes and No by either shaking or nodding your head while wearing AirPods. This should make it easier to provide Apple's digital assistant with simple responses in areas like a crowded bus or quiet waiting room where you might be uncomfortable talking aloud.
However, the biggest addition this year is that alongside all the iPad-specific features, Apple's tablet OS is also getting Apple Intelligence. This covers many of the company's new AI-powered features like the ability to create summaries of websites, proofread or rewrite emails or even generate new art based on your prompts.
Apple says that to make its AI more useful, features will be more personalized and contextual. That said, to help protect your privacy and security, the company claims it won't build profiles or sell data to outside parties. Generally, Apple says it will use on-device processing for most of its tools, though some features require help from the cloud.
As its iconic digital assistant, Siri is getting a big refresh via Apple Intelligence too. This includes better natural language recognition and the ability to understand and remember context from one query to another. Siri will also be able to help you use your device, allowing you to ask your tablet how to perform certain tasks, search for files or control apps and features using your voice.
One example of what Apple Intelligence can do is highlight priority emails and put them at the top of your inbox so you don't miss important messages or events. Or if you're feeling more creative, you can use AI to create unique emoji (called Genmoji). And in photos, Apple Intelligence can help you edit images with things like the Clean Up tool. And for those who want the freedom to use other AI models, Apple is adding the option to integrate other services, the first of which will be ChatGPT.
Finally, other minor updates include a new Passwords app for stashing credentials across apps and websites, a new dedicated Game Mode with personalized spatial audio, expanded hiking results in Apple Maps and a new eye-tracking feature for improved accessibility.
AI better predicts back surgery outcomes – Futurity: Research News
Posted: at 2:48 am
You are free to share this article under the Attribution 4.0 International license.
Researchers who had been using Fitbit data to help predict surgical outcomes have a new method to more accurately gauge how patients may recover from spine surgery.
Using machine-learning techniques, researchers worked to develop a way to more accurately predict recovery from lumbar spine surgery.
The results, published in the journal Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, show that their model outperforms previous models to predict spine surgery outcomes.
This is important because in lower back surgery and many other types of orthopedic operations, outcomes vary widely depending on the patient's structural disease but also on varying physical and mental health characteristics across patients.
Surgical recovery is influenced by both physical and mental health before the operation. Some people may have excessive worry in the face of pain that can make pain and recovery worse. Others may suffer from physiological problems that worsen pain. If physicians can get a heads-up on the various pitfalls a patient faces, they can better tailor treatment plans.
"By predicting the outcomes before the surgery, we can help establish some expectations and help with early interventions and identify high risk factors," says first author Ziqi Xu, a PhD student in the lab of Chenyang Lu, a professor in the McKelvey School of Engineering at Washington University in St. Louis.
Previous work in predicting surgery outcomes typically used patient questionnaires given once or twice in clinics, capturing a static slice of time.
"It failed to capture the long-term dynamics of physical and psychological patterns of the patients," Xu says. "Prior work training machine-learning algorithms focused on just one aspect of surgery outcome but ignored the inherent multidimensional nature of surgery recovery," she adds.
Researchers have used mobile health data from Fitbit devices to monitor and measure recovery and compare activity levels over time. But the new research has shown that activity data, plus longitudinal assessment data, is more accurate in predicting how the patient will do after surgery, says Jacob Greenberg, an assistant professor of neurosurgery at the School of Medicine.
The current work offers a proof of principle showing that, with multimodal machine learning, doctors can see a more accurate big picture of the interrelated factors that affect recovery. Before beginning this work, the team first laid out the statistical methods and protocol to ensure they were feeding the artificial intelligence system the right balanced diet of data.
Previously, the team had published work in the journal Neurosurgery showing for the first time that patient-reported and objective wearable measurements improve predictions of early recovery compared to traditional patient assessments.
In addition to Greenberg and Xu, Madelynn Frumkin, a PhD student studying psychological and brain sciences in Thomas Rodebaugh's laboratory, was a co-first author on that work. Wilson Zack Ray, a professor of neurosurgery at the School of Medicine, was co-senior author, along with Rodebaugh and Lu. Rodebaugh is now at the University of North Carolina at Chapel Hill.
In that research, they show that Fitbit data can be correlated with multiple surveys that assess a person's social and emotional state. They collected that data via ecological momentary assessments (EMAs) that employ smartphones to give patients frequent prompts to assess mood, pain levels, and behavior multiple times throughout the day.
"We combine wearables, EMA, and clinical records to capture a broad range of information about the patients, from physical activities to subjective reports of pain and mental health, and to clinical characteristics," Lu says.
Greenberg adds that state-of-the-art statistical tools that Rodebaugh and Frumkin have helped advance, such as Dynamic Structural Equation Modeling, were key in analyzing the complex, longitudinal EMA data.
For the most recent study, they took all those factors and developed a new machine-learning technique of Multi-Modal Multi-Task Learning to effectively combine these different types of data to predict multiple recovery outcomes.
"In this approach, the AI learns to weigh the relatedness among the outcomes while capturing their differences from the multimodal data," Lu adds.
This method takes shared information on interrelated tasks of predicting different outcomes and then leverages the shared information to help the model understand how to make an accurate prediction, according to Xu.
It all comes together in the final package, producing a predicted change for each patient's post-operative pain interference and physical function score.
Greenberg says the study is ongoing as the researchers continue to fine-tune their models so they can take more detailed assessments, predict outcomes and, most notably, understand what types of factors can potentially be modified to improve longer-term outcomes.
Funding for the study came from AO Spine North America, the Cervical Spine Research Society, the Scoliosis Research Society, the Foundation for Barnes-Jewish Hospital, Washington University/BJC Healthcare Big Ideas Competition, the Fullgraf Foundation, and the National Institute of Mental Health.
Source: Washington University in St. Louis
How to think about the economics of AI – Top1000funds.com
Posted: at 2:48 am
The most underrated area of innovation in artificial intelligence is not in computing, nor is it in the development of algorithms or techniques for data collection. It is in the human ability to recast problems in terms of predictions.
Leading economist and academic Ajay Agrawal told the Fiduciary Investors Symposium in Toronto that it helps to think of AI and machine learning as simply a drop in the cost of prediction.
Agrawal serves as the Geoffrey Taber Chair in Entrepreneurship and Innovation at the University of Toronto's Rotman School of Management, as well as being a Professor of Strategic Management.
"AI is computational statistics that does prediction," Agrawal said.
That's all it is. And so, on the one hand, that seems very limiting. On the other hand, the thing that's so remarkable about it is all the things we've discovered that we can do with high fidelity prediction.
Agrawal said prediction is, in simple terms, taking information you have to generate information you don't have. And it's the creativity of people to recast problems, that none of us in this room characterised as prediction problems, into prediction that underpins developments in and the potential of AI, he said.
Five years ago, probably nobody in this room would have said driving is a prediction problem.
Very few people in the room would have said translation is a prediction problem. Very few of you would have said replying to email is a prediction problem. But that's precisely how we're solving all those things today.
Whether it's predictive text when replying to an email or enhancing investment performance, the supporting AI systems are all implementations of statistics and prediction, Agrawal said.
These prediction models reached a zenith in large language models (LLMs), where machines were trained on how to predict the next most likely word in a sequence of words that made up sentences, paragraphs and whole responses.
"If you think about language, let's say English, every book, every poem, every scripture that you've ever read, is a resequencing of the same characters: 26 letters, a few punctuation marks, just re-sequenced over and over again makes all the books. What if we could do that with actions?" Agrawal said.
The principles of LLMs (next most likely word) are now being applied to large behavioural models (robots) by training them to predict the next most likely verb or action.
In that case, we could take all the tasks: think about everyone that you know, every job they do, and every job probably has 30 or 40 different tasks, so there's hundreds of thousands of tasks. But what if all those tasks are just really sequences of a small number of verbs?
So what they're doing is they're training the robots to do a handful of verbs: 50, 80, 120 verbs. Then you give the robot a prompt, just like ChatGPT. You say to the robot, can you please unpack those boxes and put the tools on the shelf? The robot hears the prompt, and then predicts what is the optimal sequence of verbs in order to complete the task.
It is, Agrawal said, another application of prediction.
Agrawal said that businesses and industries are now facing a tidal wave of problems that have been recast as prediction problems.
So we now are pointing machine intelligence at many of these.
The problem is, it has come so hard and so fast, that people seem to be struggling with where do we start? And how do we actually point this towards something useful?
Agrawal said it pays to be very specific about the metric or the performance measure that needs to be improved, and then [point] the AI at that.
"AIs are mathematical optimisers, they have to know what they're optimising towards," he said.
If the problem is a tidal wave of new solutions available, and the problem is we don't know how to harness it, here is a way to think about the solution: a short-term and a long-term strategy.
Agrawal said short-term strategies are basically productivity enhancements. They're deployable within a year, aim for 20 per cent productivity gains, and have a payback period of no more than two years.
"And here's the key point, no change in the workflow," he said.
In other words, it's truly a technology project where you just drop it in, but the rest of the system stays the same.
Long-term strategies take longer to deploy but they're genuine game-changers, offering gains 10 times or more greater than short-term deployments. But critically, they require a redesign of workflows.
Agrawal said AI, like electricity, is a general-purpose technology, and a useful analogy is when factories were first electrified and started to move away from steam-powered engines.
In the first 20 years after electricity was invented, there was very low take-up: less than 3 per cent of factories used electricity, and when they did, the main value proposition was it will reduce your input costs by doing things like replacing gas lamps.
Nobody wanted to tear apart their existing infrastructure in order to have that marginal benefit, Agrawal said.
The only ones that were experimenting with electricity were entrepreneurs building new factories, and even then, most of them said, No, I want to stick with what I know in terms of factory design.
But a few entrepreneurs realised there was a chance to completely reimagine and redesign a factory that was powered by electricity, because no longer was it dependent on transmitting power from engines outside the factory via long steel shafts to drive the factory machinery.
When the shafts became obsolete, so did the large columns inside the factories to support them. And that opened the door to lightweight, lower-cost construction, and factory design and layout changed to having everything on one level.
They redesigned the entire workflow, Agrawal said.
The machines, the materials, the material handling, the people flow, everything [was] redesigned. Some of the factories got up to 600 per cent productivity lift.
Agrawal said initially, the productivity differences between electrified and non-electrified factories were very small.
"You could be operating a non-electrified factory and think those guys who want the newfangled electricity, it's more trouble than it's worth," he said.
But the productivity benefits just started taking off from electricity.
Now we're seeing the same thing with machine intelligence [and] the adoption rate of AI.
However, Agrawal said the characteristic that makes AI different from every other tool we've ever had in human history is that this is the only one that learns from us.
He said this explains the headlong development rush and the commitment of so much capital to the technology.
"The way AI works is that whoever gets an early lead, their AI gets better; when their AI gets better, they get more users; when they get more users, they get more data; when they get more data, then the AI, the prediction, improves," he said.
And so, once they get that flywheel turning, it gets very hard to catch up to them.
Agrawal said AI and machine learning are developing so quickly it's virtually impossible for companies and businesses to keep up, let alone implement and adapt.
"The thing I would pay attention to is not so much the technology capability, because obviously that's important and it's moving quickly," he said.
"But what I'm watching are the unit economics of the companies who are first experimenting with it, and then putting it into production," he said.
Cost just keeps going down because the AI is learning and getting better. And so that, like my sense there is, just pay very laser-close attention to the unit economics of what it costs to do a thing.
And you can go right down the stack of every good and service watching how, when you start applying these machine intelligence solutions to that thing, do the unit economics change?
Deep learning models for predicting the survival of patients with hepatocellular carcinoma based on a surveillance … – Nature.com
Posted: at 2:48 am
Data description
In this study, 35,444 HCC patients were screened from the SEER database between 2010 and 2015, with 2197 patients meeting the criteria for inclusion. Table 1 shows the patients' main baseline clinical characteristics (eTable 1 in the Supplement). Among the 2197 participants, 70% (n=1548) were aged 66 years and below, 23% (n=505) were between 66 and 77 years old, and 6.6% (n=144) were over 77 years old. Male participants accounted for 78% (n=1915), while females represented 22% (n=550). In terms of race, the majority of participants were White, accounting for 66% (n=1455), followed by Asians or Pacific Islanders at 22% (n=478), Black individuals at 10% (n=228), and Native Americans/Alaskan Natives at only 1.6% (n=36). Regarding marital status, 60% (n=1319) were married, and the remaining 40% (n=878) were of other marital statuses. Histologically, most participants (98%, n=2154) were of type 8170. Additionally, 50% (n=1104) of the patients were grade II differentiated, 18% (n=402) were grade III, 1.0% (n=22) were grade IV, and 30% (n=669) were grade I. In terms of tumor staging, 48% (n=1054) of participants were at stage I, 29% (n=642) at stage II, 16% (n=344) at stage III, and 7.1% (n=157) at stage IV. Regarding the TNM classification, 49% (n=1079) were T1, 31% (n=677) were T2, 96% (n=2114) were N0, and 95% (n=2090) were M0. 66% (n=1444) of the participants had a positive/elevated AFP. 70% (n=1532) showed high levels of liver fibrosis. 92% (n=2012) had a single tumor, while the remaining 8.4% (n=185) had multiple tumors. 32% (n=704) underwent lobectomy, 14% (n=311) underwent local tumor destruction, 34% (n=753) had no surgery, and 20% (n=429) underwent wedge or segmental resection. Finally, 2.1% (n=46) received radiation therapy, with 62% (n=1352) not receiving chemotherapy and 38% (n=855) undergoing chemotherapy. The average overall survival (OS) for participants was 45 ± 34 months, with 1327 (60%) surviving at the end of follow-up.
Following univariate Cox regression analysis, we identified several factors significantly correlated with the survival rate of hepatocellular carcinoma patients (p<0.05). These factors included age, race, marital status, histological type, tumor grade, tumor stage, T stage, N stage, M stage, alpha-fetoprotein levels, tumor size, type of surgery, and chemotherapy status. These variables all significantly impacted patient survival in the univariate analysis. However, in the multivariate Cox regression analysis, we further confirmed that only age, marital status, histological type, tumor grade, tumor stage, and tumor size were independent factors affecting patient survival (p<0.05) (Table 1). Additionally, through collinearity analysis, we observed a significant high degree of collinearity between tumor staging (Stage) and the individual stages of T, N, and M (Fig.1). This phenomenon occurs primarily because the overall tumor stage (Stage) is directly determined based on the results of the TNM assessment. This collinearity suggests the need for cautious handling of these variables during modeling to avoid overfitting and reduced predictive performance. Despite certain variables not being identified as independent predictors in multivariable analysis, we incorporated them into the construction of our deep learning model for several compelling reasons. Firstly, these variables may capture subtle interactions and nonlinear relationships that are not readily apparent in traditional regression models, but can be discerned through more sophisticated modeling techniques such as deep learning. Secondly, including a broader set of variables may enhance the generalizability and robustness of the model across diverse clinical scenarios, allowing it to better account for variations among patient subgroups or treatment conditions. 
Based on this analysis, we ultimately selected 12 key factors (age, race, marital status, histological type, tumor grade, T stage, N stage, M stage, alpha-fetoprotein, tumor size, type of surgery, chemotherapy) for inclusion in the construction of the predictive model. We divided the dataset into two subsets: a training set containing 1537 samples and a test set containing 660 samples (Table 2). By training and testing the model on these data, we aim to develop a model that can accurately predict the survival rate of hepatocellular carcinoma patients, assisting in clinical decision-making and improving patient prognosis.
Correlation coefficients for each pair of variables in the data set.
Initially, we conducted fivefold cross-validation on the training set and performed 1000 iterations of random search. Across all these validations, we selected the parameters that showed the highest average concordance index (C-index) and identified them as the optimal parameters. Figure 2 displays the loss function graphs for the two deep learning models, NMTLR and DeepSurv, showing how the loss of each model changed during training.
Loss convergence graphs for the (A) DeepSurv and (B) neural network multitask logistic regression (NMTLR) models.
Table 3 presents the predictive performance of the machine learning models and the standard Cox Proportional Hazards (CoxPH) model on the test set. In our analysis, we employed the log-rank test to compare the concordance indices (C-index) across models. The results indicated that the three machine learning models (DeepSurv, NMTLR, and RSF) demonstrated significantly superior discriminative ability compared to the standard CoxPH model (p<0.01), as detailed in Table 4. Specifically, the C-index was 0.7317 for DeepSurv, 0.7353 for NMTLR, and 0.7336 for RSF, compared to only 0.6837 for the standard CoxPH model. Among the three machine learning models, NMTLR had the highest C-index. Further analysis of the Integrated Brier Score (IBS) gave values of 0.1598 (NMTLR), 0.1632 (DeepSurv), 0.1648 (RSF), and 0.1789 (CoxPH) (Fig. 3). The NMTLR model had the lowest IBS, indicating the smallest overall prediction error among the four models. Additionally, there was no significant difference between the C-indices obtained from the training and test sets, suggesting that the NMTLR model generalizes well to complex real-world data and is not overfitted.
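To make the discrimination metric concrete, the following is a minimal sketch of Harrell's C-index for right-censored data. It is an illustration only, not the implementation used in the study: the function name and toy data are hypothetical, and pairs with tied observed times are simply skipped, which full implementations handle more carefully.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data (simplified).

    times:       observed follow-up times
    events:      1 if the event (death) was observed, 0 if censored
    risk_scores: higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that subject a has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the earlier subject had the event.
        # Assumption: pairs with tied times are skipped for simplicity.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # earlier death was given the higher risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four patients, the third one censored.
print(concordance_index([5, 10, 15, 20], [1, 1, 0, 1], [0.9, 0.6, 0.7, 0.2]))  # 0.8
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is the scale on which the C-indices above should be read.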
Through calibration plots (Fig. 4), we observed that the NMTLR model demonstrated the best consistency between model predictions and actual observations in terms of 1-year, 3-year, and 5-year overall survival rates, followed by the DeepSurv model, RSF model, and CoxPH model. This consistency was also reflected in the AUC values: for the prediction of 1-year, 3-year, and 5-year survival rates, the NMTLR and DeepSurv models had higher AUC values than the RSF and CoxPH models. Specifically, the 1-year AUC values were 0.803 for NMTLR and 0.794 for DeepSurv, compared to 0.786 for RSF and 0.766 for CoxPH; the 3-year AUC values were 0.808 for NMTLR and 0.809 for DeepSurv, compared to 0.797 for RSF and 0.772 for CoxPH; the 5-year AUC values were 0.819 for both DeepSurv and NMTLR, compared to 0.812 for RSF and 0.772 for CoxPH. The results indicate that, in predicting the survival prognosis of patients with hepatocellular carcinoma, the deep learning models (DeepSurv and NMTLR) demonstrate higher accuracy than the RSF and the classical CoxPH models. The NMTLR model performed best on most of the evaluation metrics.
Receiver operating characteristic (ROC) curves and calibration curves for 1-, 3-, and 5-year survival predictions. ROC curves for (A) 1-, (C) 3-, (E) 5-year survival predictions. Calibration curves for (B) 1-, (D) 3-, (F) 5-year survival predictions.
In the feature analysis of deep learning models, the importance of a feature can be measured by the percentage decrease in the concordance index (C-index) when its values are replaced with random data. A larger percentage decrease indicates that the feature is more important for maintaining the model's predictive accuracy. Figure 5 shows the feature importance heatmaps for the DeepSurv, NMTLR, and RSF models.
Heatmap of feature importance for DeepSurv, neural network multitask logistic regression (NMTLR) and random survival forest (RSF) models.
In the NMTLR model, the replacement of features such as age, race, marital status, histological type, tumor grade, T stage, N stage, alpha-fetoprotein, tumor size, type of surgery, and chemotherapy led to an average decrease in the concordance index by more than 0.1%. In the DeepSurv model, features like age, race, marital status, histological type, T stage, N stage, alpha-fetoprotein, tumor size, and type of surgery saw a similar average decrease in the concordance index when replaced with random data. In the RSF model, we found that features including age, race, tumor grade, T stage, M stage, tumor size, and type of surgery significantly impacted the model's accuracy, as evidenced by a noticeable decrease in the C-index, averaging a reduction of over 0.1% when replaced with random data.
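The feature importance procedure described above can be sketched as follows. This is a hedged illustration rather than the study's code: it assumes that "random data" means a random shuffle of the feature's own column, and the names `permutation_importance`, `predict`, `tumor_size`, and `noise` are hypothetical. Any callable that maps a list of feature dictionaries to risk scores can stand in for the trained model.

```python
import random
from itertools import combinations


def c_index(times, events, risks):
    """Harrell's C-index; pairs with tied times are skipped for brevity."""
    conc, comp = 0.0, 0
    for i, j in combinations(range(len(times)), 2):
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if times[a] == times[b] or not events[a]:
            continue
        comp += 1
        conc += 1.0 if risks[a] > risks[b] else 0.5 if risks[a] == risks[b] else 0.0
    return conc / comp


def permutation_importance(predict, rows, times, events, n_repeats=20, seed=0):
    """Mean percentage drop in C-index when one feature column is shuffled."""
    rng = random.Random(seed)
    baseline = c_index(times, events, predict(rows))
    importance = {}
    for name in rows[0]:
        drops = []
        for _ in range(n_repeats):
            column = [row[name] for row in rows]
            rng.shuffle(column)  # break the link between feature and outcome
            shuffled = [dict(row, **{name: v}) for row, v in zip(rows, column)]
            score = c_index(times, events, predict(shuffled))
            drops.append(100.0 * (baseline - score) / baseline)
        importance[name] = sum(drops) / n_repeats
    return importance


# Hypothetical toy model: risk depends on tumor size only.
rows = [{"tumor_size": s, "noise": n}
        for s, n in zip([10, 20, 30, 40, 50, 60], [3, 1, 2, 3, 1, 2])]
times = [60, 50, 40, 30, 20, 10]
events = [1, 1, 1, 1, 1, 1]
scores = permutation_importance(
    lambda data: [row["tumor_size"] for row in data], rows, times, events)
```

In this toy setup the unused `noise` feature has an importance of exactly zero, while shuffling `tumor_size` lowers the C-index, mirroring how the heatmaps separate influential from uninfluential features.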
In the training cohort, the NMTLR model was employed to predict patient risk probabilities. Optimal threshold values for these probabilities were determined using X-tile software. Patients were stratified into low-risk (<178.8), medium-risk (178.8–248.4), and high-risk (>248.4) categories based on these cutoff points. Statistically significant differences were observed in the survival curves among the groups, with a p-value of less than 0.001, as depicted in Fig. 6A. Similar results were replicated in the external validation cohort, as shown in Fig. 6B, underscoring the robust risk stratification capability of the NMTLR model.
Kaplan-Meier curves evaluating the risk stratification ability of the NMTLR model.
The web application developed in this study, primarily intended for research or informational purposes, is publicly accessible at http://120.55.167.119:8501/. The functionality and output visualization of this application are illustrated in Fig. 7 and eFigure 1 in the Supplement.
The online web-based application of the NMTLR model.
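As a small illustration of the stratification step, the sketch below assigns patients to the three risk groups using the cutoffs reported above. The function name and the example scores are hypothetical, and the sketch assumes that scores equal to a cutoff fall into the medium-risk group, as the reported ranges imply.

```python
from collections import Counter

LOW_CUTOFF = 178.8    # below this value: low risk
HIGH_CUTOFF = 248.4   # above this value: high risk


def risk_group(score):
    """Map a model risk score to the low, medium, or high risk group."""
    if score < LOW_CUTOFF:
        return "low"
    if score <= HIGH_CUTOFF:
        return "medium"
    return "high"


# Hypothetical risk scores for six patients.
example_scores = [120.5, 178.8, 200.0, 248.4, 248.5, 310.2]
group_counts = Counter(risk_group(s) for s in example_scores)
print(group_counts)
```

The resulting group labels are what a Kaplan-Meier analysis or log-rank test would then take as the grouping variable.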
Enhancing customer retention in telecom industry with machine learning driven churn prediction | Scientific Reports – Nature.com
Posted: at 2:48 am
Developing a prognostic model using machine learning for disulfidptosis related lncRNA in lung adenocarcinoma … – Nature.com
Posted: at 2:48 am
Identification of prognostically relevant DRLs and construction of prognostic models
In our investigation of the LUAD landscape, we analyzed 16,882 lncRNAs derived from the TCGA-LUAD database. This evaluation identified 708 DRLs that interact significantly with DRGs, as depicted in a Sankey diagram (Fig. 2A). Through further analysis incorporating data from three GEO databases, we narrowed these DRLs down to 199 lncRNAs consistently present across datasets, suggesting a pivotal role in LUAD pathogenesis (Fig. 2B). Our prognostic assessment using univariate Cox regression analysis revealed 27 lncRNAs with significant implications for LUAD patient outcomes (Fig. 2C). Leveraging these lncRNAs, we constructed a predictive model employing an ensemble of machine learning techniques, with the ensemble model (Supplementary Table 2) achieving a notably high C-index of 0.677 [95% confidence interval (CI) 0.63 to 0.73], suggesting robust predictive performance (Fig. 2D). This model's effectiveness was further validated through a risk stratification system, categorizing patients into high-risk and low-risk groups based on their lncRNA expression profiles. This stratification was substantiated by principal component analysis (PCA), which confirmed the distinct separation between the risk groups, underscoring the potential of our model in clinical risk assessment (Fig. 2E).
Construction of prognostic model composed of 27 DRLs. (A) Sankey diagram illustrating the relationship between DRGs and associated lncRNAs. (B) The intersection of DRLs sourced from the TCGA database and GEO database. (C) 27 lncRNAs after univariate Cox regression. (D) 101 prediction models evaluated, with C-index calculated for each across all validation datasets. (E) Principal Component Analysis of the low-risk and high-risk cohorts based on 27 DRLs.
Our survival analysis using the TCGA-LUAD dataset revealed a significant distinction in OS between the high- and low-risk groups identified through our model (p<0.001, log-rank test) (Fig. 3A). This finding was consistently replicated across three independent GEO datasets, demonstrating significant differences in both OS (GSE31210, p=0.001; GSE30219, p=0.019; GSE50081, p=0.025) (Fig. 3B–D) and DFS (GSE31210, p<0.001; GSE30219, p=0.009; GSE50081, p=0.023) (Supplementary Fig. S1A–C). The predictive power of the risk score was superior to that of traditional prognostic factors such as age, gender, and staging, as evidenced by the C-index comparison (Supplementary Fig. S1D). The risk score also emerged as an independent prognostic indicator in our univariate and multivariate Cox analyses (p<0.001) (Supplementary Table 3). Multicollinearity within the model was assessed using the variance inflation factor, which was below 10 for all variables (Supplementary Table 4). The AUC analysis further validated the robustness of our model, with one-year, two-year, and three-year AUCs of 0.76, 0.72, and 0.74, respectively, in the TCGA-LUAD dataset (Fig. 3F). The external validation using GEO datasets underscored the model's accuracy, particularly notable in GSE30219, GSE50081, and GSE31210 for the evaluated intervals (Fig. 3G–I).
Efficacy of the DRLs Survival Prognostic Risk Model. Kaplan-Meier (KM) analysis for high-risk and low-risk groups in (A) TCGA-LUAD, (B) GSE31210, (C) GSE30219, and (D) GSE50081. (E) Kaplan-Meier (KM) survival curves for mutant and non-mutant groups. Analysis of 1-, 2-, and 3-year ROC curves for (F) TCGA-LUAD, (G) GSE30219, (H) GSE50081, and (I) GSE31210.
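The variance inflation factor (VIF) check mentioned above can be sketched in a few lines. This is an illustrative implementation under stated assumptions, not the authors' code: it uses the identity that the VIF of variable j equals the j-th diagonal entry of the inverse of the correlation matrix of the predictors, and all function names and data are hypothetical. Constant columns are not handled.

```python
from statistics import mean, pstdev


def correlation_matrix(columns):
    """Pearson correlation matrix of a list of equally long columns."""
    n = len(columns[0])
    standardized = []
    for col in columns:
        m, s = mean(col), pstdev(col)
        standardized.append([(v - m) / s for v in col])
    return [[sum(a * b for a, b in zip(zi, zj)) / n for zj in standardized]
            for zi in standardized]


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting."""
    k = len(matrix)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(k)]
           for i, row in enumerate(matrix)]
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(aug[r][c]))
        if abs(aug[pivot][c]) < 1e-12:
            raise ValueError("singular matrix: perfect collinearity")
        aug[c], aug[pivot] = aug[pivot], aug[c]
        p = aug[c][c]
        aug[c] = [v / p for v in aug[c]]
        for r in range(k):
            if r != c:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[k:] for row in aug]


def variance_inflation_factors(columns):
    """VIF of each column: diagonal of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]


# Hypothetical predictors: the two columns are strongly correlated.
vifs = variance_inflation_factors([[1, 2, 3, 4], [1, 2, 3, 5]])
```

A VIF of 1 means a variable is uncorrelated with the others; the threshold of 10 used above is a common rule of thumb for flagging problematic multicollinearity.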
Further analysis showed gender-specific differences in risk scores across various pathological stages. In early stages (I and II), men exhibited significantly higher risk scores compared to women (Stage I: p=0.015; Stage II: p=0.006; Wilcoxon test) (Supplementary Fig. S2A,B). However, these differences were not observed in later stages (III/IV) (p=0.900, Wilcoxon test) (Supplementary Fig. S2C), suggesting stage-specific risk dynamics. In addition, our study uncovered notable disparities in risk scores among patients with mutations in EGFR, ALK, and KRAS genes in the GSE31210 dataset (p<0.001, Kruskal-Wallis test) (Supplementary Fig. S2D). Patients harboring these mutations also exhibited better OS compared to those without (p=0.018, log-rank test) (Fig. 3E), highlighting the potential prognostic relevance of genetic profiles in LUAD. The impact of smoking, a known risk factor for LUAD, was evident as significant differences in risk scores between smokers and non-smokers were observed in analyses of the GSE31210 and GSE50081 datasets (GSE31210, p=0.003; GSE50081, p=0.027; Wilcoxon test) (Supplementary Fig. S2E,F).
To enhance our model's utility in clinical decision-making, we developed a nomogram that incorporates the identified risk scores alongside essential clinical parameters: age, gender, and TNM staging. This integration aims to provide a more comprehensive tool for predicting the prognosis of LUAD patients (Fig. 4A). We validated the nomogram's predictive accuracy using calibration curves, which compare the predicted survival probabilities against the observed outcomes. The results demonstrated a high degree of concordance, indicating that our nomogram accurately reflects patient survival rates (Fig. 4B). Further assessment through DCA (Fig. 4C–E) confirmed that the nomogram provides substantial clinical benefit. Notably, the analysis showed that the nomogram significantly outperforms the predictive capabilities of the risk score alone, particularly in terms of net benefit across a wide range of threshold probabilities.
Development of a Nomogram for Risk Prediction & Analysis of Mutation Patterns in Both Risk Groups. (A) Nomogram that combines model and clinicopathological factors. (B) Calibration curves at 1, 3, and 5 years for the nomogram. (C–E) Decision curve analysis (DCA) of the nomogram and clinical characteristics at 1, 3, and 5 years. (F) TMB levels between the high-risk and low-risk groups. (G) Gene mutation waterfall chart of the low-risk group. (H) Gene mutation waterfall chart of the high-risk group.
A marked difference in TMB was discerned between the high- and low-risk cohorts (p<0.001, Wilcoxon test) (Fig. 4F). The waterfall plot delineates the mutational landscape of the ten most prevalent genes across both risk strata. In the low-risk cohort, approximately 84.53% of specimens exhibited gene mutations (Fig. 4G), whereas in the high-risk stratum, mutations were observed in roughly 95.33% of specimens (Fig. 4H). Predominant mutations within the high-risk category included TP53, TTN, and CSMD3.
The differential expression analysis revealed a total of 1474 DEGs between the low-risk and high-risk cohorts. Among these, 568 genes were upregulated and 906 genes were downregulated. The volcano plot (Supplementary Fig. S2G) illustrates the distribution of these DEGs. These results indicate that specific genes are significantly associated with risk stratification in our study cohort. In the GO analysis (Fig. 5A,D), DEGs showed predominant enrichment in molecular functions such as organic anion transport and carboxylic acid transport. Regarding cellular components, the main enrichment was observed in the apical plasma membrane (Fig. 5C). Figure 5E shows the GSEA results, highlighting significant enrichment of specific gene sets related to metabolic processes, DNA binding, and hyperkeratosis. The KEGG results highlighted a significant enrichment of DEGs in neuroactive ligand-receptor interaction and the cAMP signaling pathway (Fig. 5B).
Biological function analysis of the DRLs risk score model. The top 5 significant terms of (A) GO function enrichment and (B) KEGG function enrichment. (C,D) System clustering dendrogram of cellular components. (E) Gene set enrichment analysis.
To validate the precision of our results, we employed seven techniques (CIBERSORT, EPIC, MCP-counter, xCell, TIMER, quanTIseq, and ssGSEA) to assess immune cell infiltration in both the high-risk and low-risk groups (Fig. 6A). With the ssGSEA data, we explored the connection between the TME and several characteristics of lung adenocarcinoma patients, such as age, gender, and disease stage (Fig. 6B). We then visualized these data with box plots for both CIBERSORT and ssGSEA (Fig. 6C,D). These plots showed that the infiltration levels of memory B cells, resting memory CD4 T cells, and monocytes were notably lower in the high-risk group than in the low-risk group. With the help of the ESTIMATE algorithm, we evaluated the stromal (Fig. 6F), immune (Fig. 6E), and ESTIMATE scores (Supplementary Fig. S3A) across the different risk groups. This allowed us to gauge tumor purity. Our study suggests that the high-risk group has reduced stromal, ESTIMATE, and immune scores. Conversely, the tumor purity score in the low-risk group is lower than that in the high-risk group (Supplementary Fig. S3B).
The tumor microenvironment between high-risk and low-risk groups based on DRLs. (A) Comparing the levels of immune cell infiltration for different immune cell types in the CIBERSORT, EPIC, MCP-counter, xCell, TIMER, and quanTIseq algorithms for low-risk and high-risk groups. (B) Immune infiltration of different lung adenocarcinoma patient characteristics. Box plot of the difference in immune cell infiltration between the high-risk and low-risk score groups based on (C) CIBERSORT and (D) ssGSEA. *p-value<0.05, **p-value<0.01, ***p-value<0.001, ns=no significance. (E) Immune score and (F) stromal score were lower in the high-risk group than in the low-risk group.
We calculated the TIDE score and forecasted the immunotherapy response in both the high-risk and low-risk groups (Fig. 7A). Based on results from both datasets, patients in the low-risk group seem more inclined to respond positively to immunotherapy. Additionally, the IPS for the combination of anti-CTLA4 and anti-PDL1 treatment, as well as for anti-CTLA4 alone, was consistently higher in the low-risk group (Fig. 7B,C). However, the analysis of anti-PDL1 treatment alone (P=0.170) did not reach statistical significance (Fig. 7D). This suggests that low-risk patients may respond better to anti-CTLA4 and/or anti-PDL1 immunotherapy. Recent research has found a link between tumor TLS and outcomes in several tumor types. In line with these discoveries, our review of the TCGA-LUAD dataset showed that LUAD patients with high TLS scores had more favorable outcomes than those with low scores (Fig. 7F). We also noticed that the TLS score was higher in the low-risk group compared to the high-risk group (Fig. 7E).
Immunotherapeutic sensitivity between high-risk and low-risk groups based on DRLs. (A) Differences in risk scores between the TIDE responsive and nonresponsive groups. (B–D) Sensitivity of high- and low-risk groups to combination therapy, anti-CTLA4, and anti-PDL1 by different IPS scores. (E) Differences in tumor tertiary lymphoid structure (TLS) scores in high-risk and low-risk groups in TCGA-LUAD. (F) KM analysis of high-TLS and low-TLS groups.
In our assessment of the relationship between risk scores and sensitivity to chemotherapy, we measured the IC50 for several widely used chemotherapeutic drugs. Our findings showed that the high-risk group was more sensitive to drugs like Cisplatin, Vinblastine, Cytarabine, Vinorelbine, Bexarotene, Cetuximab, Docetaxel, and Doxorubicin than the low-risk group (Fig. 8A–P).
Immunotherapy sensitivity analysis and in-depth study of LINC00857. (A–P) Differences in drug sensitivity between high-risk and low-risk groups. (Q) Volcano plot for GTEX_Lung vs. TCGA_Lung_Adenocarcinoma.
Through differential gene analysis of tumor tissues and normal tissues, 13,995 DEGs (|logFC|>1.5, p-value<0.050) (Fig. 8Q, Supplementary Fig. S3C) were identified. By cross-referencing with the 27 lncRNAs that form our prognostic model, we pinpointed LINC01003. Supplementary Fig. S4A presents a heatmap demonstrating the expression levels of LINC01003 across different NSCLC datasets and cell types. The results indicate that LINC01003 is differentially expressed, with notably high expression in monocytes/macrophages and endothelial cells across several datasets, suggesting its potential involvement in these cell types within the NSCLC tumor microenvironment. Supplementary Fig. S4B further illustrates the expression profile of LINC01003 in different cell populations from the GSE143423 dataset. The violin plot shows significant expression of LINC01003 in malignant cells compared to other cell types, indicating its potential role in tumor progression.
To decipher the LINC00857-related regulatory mechanisms, we constructed a lncRNA-miRNA-mRNA network (Supplementary Fig. S4C). This network illustrates the intricate interactions between LINC00857 and various miRNAs and mRNAs. In this network, LINC00857 acts as a central regulatory hub, potentially influencing gene expression by sequestering multiple miRNAs, such as hsa-miR-4709-5p, hsa-miR-760, and hsa-miR-340-5p. These miRNAs, in turn, are connected to a wide array of target genes, including YWHAZ, BCL2L2, PTEN, and MYC, which are critical in cellular processes such as cell cycle regulation, apoptosis, and signal transduction.
See the original post here:
Assessing calibration and bias of a deployed machine learning malnutrition prediction model within a large healthcare … – Nature.com
Posted: at 2:48 am
The primary training cohort used to recalibrate the model included 49,652 patients (median [IQR] age = 66.0 [26.0]), of whom 49.9% self-identified as female, 29.6% self-identified as Black or African American, 54.8% were on Medicare, and 27.8% were on Medicaid. A total of 11,664 (24%) malnutrition cases were identified. Baseline characteristics are summarized in Table 1 and malnutrition event rates are summarized in Supplementary Table 2. The validation cohort used to test the model included 17,278 patients (median [IQR] age = 66.0 [27.0]), of whom 49.8% self-identified as female, 27.1% self-identified as Black or African American, 52.9% were on Medicare, and 28.2% were on Medicaid. A total of 4,005 (23%) malnutrition cases were identified.
Although the model overall had a c-index of 0.81 (95% CI: 0.80, 0.81), it was miscalibrated according to both weak and moderate calibration metrics, with a Brier score of 0.26 (95% CI: 0.25, 0.26) (Table 2), indicating that the model is relatively inaccurate [17]. It also overfitted the risk estimate distribution, as evidenced by the calibration curve (Supplementary Fig. 1). Logistic recalibration of the model successfully improved calibration, bringing the calibration intercept to -0.07 (95% CI: -0.11, -0.03) and the calibration slope to 0.88 (95% CI: 0.86, 0.91), and significantly decreasing the Brier score (0.21, 95% CI: 0.20, 0.22), Emax (0.03, 95% CI: 0.01, 0.05), and Eavg (0.01, 95% CI: 0.01, 0.02). Recalibrating the model improved specificity (0.74 to 0.93), PPV (0.47 to 0.60), and accuracy (0.74 to 0.80) while decreasing sensitivity (0.75 to 0.35) and NPV (0.91 to 0.83) (Supplementary Tables 2 and 3).
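As a rough illustration of what logistic recalibration and the Brier score involve, the sketch below refits an intercept and slope on the logit of the original predictions and compares the Brier score before and after. It is not the authors' code: the toy predictions and outcomes are invented, the function names are hypothetical, and the intercept and slope are fitted jointly here, whereas calibration intercepts are often estimated with the slope fixed at 1.

```python
import math


def logit(p, eps=1e-12):
    """Log-odds of a probability, clipped away from 0 and 1."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))


def sigmoid(z):
    """Numerically stable inverse of the logit."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def brier_score(probs, outcomes):
    """Mean squared difference between predicted risk and the 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def fit_logistic_recalibration(probs, outcomes, max_iter=50, tol=1e-10):
    """Fit logit(P(y=1)) = a + b * logit(p) by Newton-Raphson.

    Returns (a, b): the recalibration intercept and slope.
    """
    x = [logit(p) for p in probs]
    a, b = 0.0, 1.0  # start from the identity mapping (no recalibration)
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, outcomes):
            mu = sigmoid(a + b * xi)
            w = mu * (1.0 - mu)
            r = yi - mu
            g0 += r
            g1 += r * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if det <= 0.0:
            break  # degenerate information matrix; keep current estimate
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a += da
        b += db
        if abs(da) + abs(db) < tol:
            break
    return a, b


def recalibrate(probs, a, b):
    """Apply the fitted intercept and slope to the original predictions."""
    return [sigmoid(a + b * logit(p)) for p in probs]


# Toy data (hypothetical): an overconfident model. Ten patients per risk
# level; observed event rates are 30%, 50% and 70%.
probs = [0.1] * 10 + [0.5] * 10 + [0.9] * 10
outcomes = [1] * 3 + [0] * 7 + [1] * 5 + [0] * 5 + [1] * 7 + [0] * 3

a, b = fit_logistic_recalibration(probs, outcomes)
before = brier_score(probs, outcomes)
after = brier_score(recalibrate(probs, a, b), outcomes)
print(f"intercept={a:.3f} slope={b:.3f} Brier {before:.3f} -> {after:.3f}")
```

A fitted slope below 1 on this toy data reflects predictions that are too extreme, and the recalibrated risks are pulled toward the observed event rates, which lowers the Brier score. In practice the mapping would be fitted on a training cohort and evaluated on a separate hold-out set, as the study describes.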
Weak and moderate calibration metrics between Black and White patients significantly differed prior to recalibration (Table 3, Supplementary Fig. 2A, B), with the model having a more negative calibration intercept for White patients on average compared to Black patients (-1.17 vs. -1.07), and Black patients having a higher calibration slope compared to White patients (1.43 vs. 1.29). Black patients had a higher Brier score of 0.30 (95% CI: 0.29, 0.31) compared to White patients with 0.24 (95% CI: 0.23, 0.24). Logistic recalibration significantly improved calibration for both Black and White patients (Table 4, Fig. 1a-c). For Black patients within the hold-out set, the recalibrated calibration intercept was 0 (95% CI: -0.07, 0.05), calibration slope was 0.91 (95% CI: 0.87, 0.95), and Brier score improved from 0.30 to 0.23 (95% CI: 0.21, 0.25). For White patients within the hold-out set, the recalibrated calibration intercept was -0.15 (95% CI: -0.20, -0.10), calibration slope was 0.82 (95% CI: 0.78, 0.85), and Brier score improved from 0.24 to 0.19 (95% CI: 0.18, 0.21). Post-recalibration, calibration for Black and White patients still differed significantly according to weak calibration metrics, but not according to moderate calibration metrics and the strong calibration curves (Table 4, Fig. 1). Calibration curves of the recalibrated model showed good concordance between actual and predicted event probabilities, although the predicted risks for Black and White patients differed between the 30th and 60th risk percentiles. Logistic recalibration also improved the specificity, PPV, and accuracy, but decreased the sensitivity and NPV of the model across both White and Black patients (Supplementary Tables 2 and 3). Discriminative ability was not significantly different for White and Black patients before and after recalibration. We also found calibration statistics to be relatively similar in Asian patients (Supplementary Table 4).
Columns from left to right are curves for (a) No Recalibration, (b) Recalibration-in-the-Large, and (c) Logistic Recalibration for Black vs. White patients; (d) No Recalibration, (e) Recalibration-in-the-Large, and (f) Logistic Recalibration for male vs. female patients.
Calibration metrics between male and female patients also significantly differed prior to recalibration (Table 3, Supplementary Fig. 2C, D). The model had a more negative calibration intercept for female patients on average compared to male patients (-1.49 vs. -0.88). Logistic recalibration significantly improved calibration for both male and female patients (Table 4, Fig. 1d-f). In male patients within the hold-out set, the recalibrated calibration intercept was 0 (95% CI: -0.05, 0.03), calibration slope was 0.88 (95% CI: 0.85, 0.90), and Brier score improved from 0.29 to 0.23 (95% CI: 0.22, 0.24). In female patients within the hold-out set, the recalibrated calibration intercept was -0.11 (95% CI: -0.16, -0.06) and calibration slope was 0.91 (95% CI: 0.87, 0.94), but the Brier score did not significantly improve. After logistic recalibration, only calibration intercepts differed between male and female patients. Calibration curves of the recalibrated model showed good concordance, although the predicted risks for males and females differed between the 10th and 30th risk percentiles. Discrimination metrics for male and female patients were significantly different before recalibration. The model had a higher sensitivity and NPV for females than males, but a lower specificity, PPV, and accuracy (Supplementary Table 2). The recalibrated model had the highest specificity (0.95, 95% CI: 0.94, 0.96), NPV (0.84, 95% CI: 0.83, 0.85) and accuracy (0.82, 95% CI: 0.81, 0.83) for female patients, at the cost of substantially decreasing sensitivity (0.27, 95% CI: 0.25, 0.30) (Supplementary Table 3).
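The trade-off between sensitivity and specificity reported above can be made concrete with a short sketch that derives the discrimination metrics from a confusion matrix. The data, the function name, and the decision thresholds are all hypothetical; the excerpt does not state the cut-off used in the study, and the shift here is produced by moving the threshold rather than by recalibration.

```python
def classification_metrics(probs, outcomes, threshold=0.5):
    """Compute sensitivity, specificity, PPV, NPV and accuracy at a threshold."""
    tp = fp = tn = fn = 0
    for p, y in zip(probs, outcomes):
        predicted_positive = p >= threshold
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive and y == 0:
            fp += 1
        elif not predicted_positive and y == 0:
            tn += 1
        else:
            fn += 1

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
    }


# Toy predictions (hypothetical): four cases followed by four non-cases.
probs = [0.9, 0.8, 0.6, 0.4, 0.3, 0.2, 0.7, 0.1]
outcomes = [1, 1, 1, 1, 0, 0, 0, 0]

lenient = classification_metrics(probs, outcomes, threshold=0.5)
strict = classification_metrics(probs, outcomes, threshold=0.75)
print(lenient)
print(strict)
```

With the stricter threshold fewer patients are flagged, so specificity and PPV rise while sensitivity and NPV fall, the same direction of change the study reports after recalibration.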
We also assessed calibration by payor type and hospital type as sensitivity analyses. In the payor type analysis, we found that malnutrition predicted risk was more miscalibrated in patients with commercial insurance with more extreme calibration intercepts, Emax, and Eavg suggesting overestimation of risk (Supplementary Tables 5 and 6, Supplementary Fig. 3A, B). We did not observe substantial differences in weak or moderate calibration across hospital type (community, tertiary, quaternary) except that tertiary acute care centers had a more extreme calibration intercept, suggesting an overestimation of risk (Supplementary Tables 7 and 8, Supplementary Fig. 3C, D). Across both subgroups, logistic recalibration significantly improved calibration across weak, moderate, and strong hierarchy tiers (Supplementary Table 5, Supplementary Table 7, Supplementary Figs. 4 and 5).
Go here to see the original:
5 Key Ways AI and ML Can Transform Retail Business Operations – InformationWeek
Posted: at 2:48 am
Odds are you've heard more about artificial intelligence and machine learning in the last two years than you had in the previous 20. That's because advances in the technology have been exponential, and many of the world's largest brands, from Walmart and Amazon to eBay and Alibaba, are leveraging AI to generate content, power recommendation engines, and much more.
Investment in this technology is substantial, with exponential growth projected -- the AI in retail market was valued at $7.14 billion in 2023, with the potential to reach $85 billion by 2032.
Brands of all sizes are eyeing this technology to see how it fits into their retail strategies. Let's take a look at some of the impactful ways AI and ML can be leveraged to drive business growth.
One of the major hurdles for retailers -- particularly those with large numbers of SKUs -- is creating compelling, accurate product descriptions for every new product added to their assortment. When you factor in the ever-increasing number of platforms on which a product can be sold, from third-party vendors like Amazon to social selling sites to a brand's own website, populating that amount of content can be unsustainable.
One of the areas in which generative AI excels is creating compelling product copy at scale. Natural language generation (NLG) algorithms can analyze vast amounts of product data and create compelling, tailored descriptions automatically. This copy can also be adapted to each channel, fitting specific parameters and messaging towards focused audiences. For example, generative AI engines understand the word count restrictions for a particular social channel. They can focus copy to those specifications, tailored to the demographic data of the person who will encounter that message. This level of personalization at scale is astonishing.
This use of AI has the potential to help brands achieve business objectives through product discoverability and conversion by creating compelling content optimized for search.
Another area in which AI and ML excel is in the cataloging and organizing of data. Again, when brands deal with product catalogs with hundreds of thousands of SKUs spread across many channels, it is increasingly difficult to maintain consistency and clarity of information. Product, inventory, and eCommerce managers spend countless hours attempting to keep all product information straight and up-to-date, and they still make mistakes.
Brands can leverage AI to automate tasks such as product categorization, attribute extraction, and metadata tagging, ensuring accuracy and scalability in data management across all channels. This use of AI takes the guesswork and labor out of meticulous tasks and can have wide-ranging business implications. More accurate product information means a reduction in returns and improved product searchability and discoverability through intuitive data architecture.
As online shopping has evolved over the past decade, consumer expectations have shifted. Customers rarely go to company websites and browse endless product pages to discover the product they're looking for. Rather, customers expect a curated and personalized experience, regardless of the channel through which they're encountering the brand. A report from McKinsey showed that 71% of customers expect personalization from a brand, and 76% get frustrated when they don't encounter it.
Brands have been offering personalized experiences for decades, but AI and ML unlock entirely new avenues for personalization. Once again, AI enables an unprecedented level of scale and nuance in personalized customer interactions. By analyzing vast amounts of customer data, AI algorithms can connect the dots between customer order history, preferences, location and other identifying user data and create tailored product recommendations, marketing messages, shopping experiences, and more.
This focus on personalization is key for business strategy and hitting benchmarks. Personalization efforts lead to increases in conversion, higher customer engagement and satisfaction, and better brand experiences, which can lead to long-term loyalty and customer advocacy.
Search functionalities are in a constant state of evolution, and the integration of AI and ML is the next leap. AI-powered search algorithms are better able to process natural language, enabling a brand to understand user intent and context, which improves search accuracy and relevance.
What's more, AI-driven search can provide valuable insights into customer behavior and preferences, enabling brands to optimize product offerings and marketing strategies. By analyzing search patterns and user interactions, brands can identify emerging trends, optimize product placement, and tailor promotions to specific customer segments. Ultimately, this enhanced search experience improves customer engagement while driving sales growth and fostering long-term customer relationships.
At its core, the main benefit of AI and ML tools is that they're always working and never burn out. This benefit is felt most strongly in customer support. Tools like chatbots and virtual assistants enable brands to provide instant, personalized assistance around the clock and around the world. This automation reduces wait times, improves response efficiency, and frees staff to focus on higher-level tasks.
Much like personalization engines used in sales, AI-powered customer support tools can process vast amounts of customer data to tailor responses based on a customer's order history and preferences. Also, like personalization, these tools can be deployed to radically reduce the amount of time customer support teams spend on low-level inquiries like checking order status or processing returns. Leveraging AI in support allows a brand to allocate resources in more impactful ways without sacrificing customer satisfaction.
Brands are just scratching the surface of the capabilities of AI and ML. Still, early indicators show that this technology can have a profound impact on driving business growth. Embracing AI can put brands in a position to transform operational efficiency while maintaining customer satisfaction.
Go here to see the original:
5 Key Ways AI and ML Can Transform Retail Business Operations - InformationWeek
Machine learning helps find advantageous combination of salts and organic solvents for easier anti-icing operations – Phys.org
Posted: at 2:48 am
An Osaka Metropolitan University research team has found a deicing mixture with high effectiveness and low environmental impact after using machine learning to analyze ice melting mechanisms of aqueous solutions of 21 salts and 16 organic solvents. The research was published in Scientific Reports on June 7, 2024.
The dangers of frozen roads, airplane engines, and runways are well known, but the use of commercial products often means short-term safety over long-term environmental degradation. Seeking a better product, Osaka Metropolitan University researchers have developed a deicing mixture offering higher performance than deicers on the market while also having less impact on the environment.
The team, made up of graduate student Kai Ito, Assistant Professor Arisa Fukatsu, Associate Professor Kenji Okada, and Professor Masahide Takahashi of the Graduate School of Engineering, used machine learning to analyze ice melting mechanisms of aqueous solutions of 21 salts and 16 organic solvents. The group then conducted experiments to find that a mixture of propylene glycol and aqueous sodium formate solution showed the best ice penetration capacity.
Because of the mixture's effectiveness, less of the substance needs to be used, thereby also lessening the environmental impact. It is also not corrosive, preventing damage, for example, when used for airport runways.
"We are proposing an effective and environmentally friendly deicer that combines the advantages of salts and organic solvents," said Dr. Fukatsu.
The results of this research also provide new insights into the ice melting process.
"The development of highly efficient deicers is expected to make deicing and anti-icing operations easier," Professor Takahashi added. "This will also lessen the environmental impact by reducing the amount of deicer used."
More information: Machine learning-assisted chemical design of highly efficient deicers, Scientific Reports (2024). DOI: 10.1038/s41598-024-62942-y
Journal information: Scientific Reports
Read the original: