Microsoft Releases Phi-3.5 AI Models
The launch of the new AI models was announced by Microsoft executive Weizhu Chen in a post on X (formerly known as Twitter). The Phi-3.5 models offer upgraded capabilities over their predecessor, but the architecture, dataset and training methods largely remain the same. The Mini model has been updated with multilingual support, and the MoE and Vision models are new additions to the AI model family.
Coming to the technical details, the Phi-3.5 Mini has 3.8 billion parameters. It uses the same tokeniser (a tool that breaks down text into smaller units) and a dense decoder-only transformer. The model supports only text as input and has a context window of 128,000 tokens. The company claims it was trained on 3.4 trillion tokens between June and August, and its knowledge cutoff is October 2023.
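To make the context window figure concrete, here is a minimal sketch of a token budget check. It assumes, as is typical for decoder-only models, that the prompt and the generated text share the same window. The `toy_tokenise` helper is a hypothetical whitespace splitter used only for illustration; the real Phi-3.5 tokeniser works on subword units and will produce different counts.

```python
CONTEXT_WINDOW = 128_000  # tokens, the figure reported for Phi-3.5 Mini


def toy_tokenise(text):
    """Hypothetical stand-in for a real tokeniser: split on whitespace.

    Real tokenisers break text into subword units, so actual token
    counts are usually higher than a simple word count.
    """
    return text.split()


def fits_in_context(prompt_tokens, max_new_tokens, window=CONTEXT_WINDOW):
    """Return True if the prompt plus the requested output fits the window."""
    return len(prompt_tokens) + max_new_tokens <= window


tokens = toy_tokenise("The Phi-3.5 Mini accepts text input only")
short_ok = fits_in_context(tokens, max_new_tokens=512)
long_ok = fits_in_context(["x"] * 127_900, max_new_tokens=512)
print(len(tokens), short_ok, long_ok)
```

The second check fails because 127,900 prompt tokens plus 512 new tokens exceeds the 128,000-token window.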
One key highlight of this model is that it now supports several new languages, including Arabic, Chinese, Czech, Danish, Dutch, English, Finnish, French, German, Hebrew, Hungarian, Italian, Japanese, Korean, Norwegian, Polish, Portuguese, Russian, Spanish, Swedish, Thai, Turkish, and Ukrainian.
The Phi-3.5 Vision AI model has 4.2 billion parameters and includes an image encoder that allows it to process information within an image. With the same context length as the Mini model, it accepts both text and images as input. It was trained between July and August on 500 billion tokens of data and has a text data cutoff of March.
Finally, the Phi-3.5 MoE AI model has 16×3.8 billion parameters. However, only 6.6 billion of them are active parameters when using two experts. Notably, MoE (mixture of experts) is a technique where multiple models (experts) are trained independently and then combined to improve the accuracy and efficiency of the model. This model was trained on 4.9 trillion tokens of data between April and August, and it has a knowledge cutoff date of October 2023.
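The idea of using only two of 16 experts at a time can be illustrated with a generic top-2 routing sketch. This is not Microsoft's implementation: the toy experts, the hand-written router scores and the function names are all assumptions made for illustration. In a real model, the router scores are produced by a learned layer for each token.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_route(router_logits, k=2):
    """Pick the k highest-scoring experts and normalise their weights."""
    ranked = sorted(
        range(len(router_logits)),
        key=lambda i: router_logits[i],
        reverse=True,
    )[:k]
    weights = softmax([router_logits[i] for i in ranked])
    return list(zip(ranked, weights))


def moe_forward(x, experts, router_logits, k=2):
    """Run only the selected experts and blend their outputs."""
    return sum(w * experts[i](x) for i, w in top_k_route(router_logits, k))


# 16 toy experts: expert i multiplies its input by (i + 1).
experts = [lambda x, scale=i + 1: x * scale for i in range(16)]

# Hypothetical router scores that favour experts 3 and 7 equally.
logits = [0.0] * 16
logits[3] = 2.0
logits[7] = 2.0

routes = top_k_route(logits)
output = moe_forward(1.0, experts, logits)
print(routes, output)
```

Only two of the 16 expert functions are called for this input, which is why the active parameter count of such a model is much smaller than its total parameter count.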
On performance, Microsoft shared benchmark scores for each of the individual models. Based on the data shared, the Phi-3.5 MoE outperforms both Gemini 1.5 Flash and GPT-4o mini in the SQuALITY benchmark, which tests readability and accuracy when summarising a long block of text. This tests the long context window of the AI model.
However, it should be mentioned that this is not a fair comparison, since MoE models use a different architecture and require more storage space and more sophisticated hardware to run. Separately, the Phi-3.5 Mini and Vision models have also outperformed relevant competing AI models in the same segment on some metrics.
Those interested in trying out the Phi-3.5 AI models can access them via their Hugging Face listings. Microsoft said that these models use flash attention, which will require users to run them on advanced GPUs. The company has tested them on Nvidia A100, A6000, and H100 GPUs.