Jialian Technology (Hong Kong) Co., Ltd.

In October, IBM released its first Artificial Intelligence Unit (AIU), a complete system-on-chip.

The AIU is an application-specific integrated circuit (ASIC) designed to train and run deep learning models, which require large-scale parallel computing, more quickly and efficiently.

AIU: designed for modern AI computing

For years, the industry has mostly used CPUs and GPUs to run deep learning models, but the number of AI models is growing exponentially.

At the same time, deep learning models are becoming ever larger, with billions or even trillions of parameters, and they demand more and more computing power. The growth in AI computing power delivered by traditional chip architectures such as CPUs and GPUs has hit a bottleneck.

Deep neural networks' demand for computing power is increasing rapidly

According to IBM, deep learning has traditionally relied on a combination of CPUs and GPU coprocessors to train and run models.

The flexibility and precision of the CPU make it ideal for general-purpose software applications, but it is at a disadvantage when it comes to training and running deep learning models that require large-scale, parallel AI operations.

The GPU was originally developed for rendering graphical images, and its advantages for AI computing were discovered only later.

However, both CPUs and GPUs were designed before the deep learning revolution, and their efficiency gains now lag behind deep learning's exponentially growing demand for computing power. What the industry really needs is a general-purpose chip optimized for the matrix and vector multiplication operations that deep learning relies on.

For the past five years, the IBM Research AI Hardware Center has focused on developing the next generation of chips and AI systems. Its goal is to improve the efficiency of AI hardware by 2.5 times a year and, by 2029, to train and run AI models 1,000 times faster than in 2019.

The new AIU is the first IBM chip to be tailored to modern AI computing.

According to IBM, the AIU was designed and optimized to accelerate the matrix and vector computations used in deep learning models. The AIU can solve computationally complex problems and perform data analysis much faster than a CPU can.

So how does the IBM AIU optimize for deep learning? The answer is “approximate computing” plus a “simplified AI workflow”.

Embrace lower precision through approximate computing

Historically, many AI computations have relied on high-precision 64-bit and 32-bit floating-point operations. IBM believes that AI computing does not always require such precision.

IBM has a term for reducing the precision of traditional calculations: “approximate computing”. In its blog, IBM explains the rationale for approximate computing:

“Do we need this kind of accuracy for common deep learning tasks? Do our brains need high-resolution images to recognize family members or cats? When we enter a text thread to search, do we need precision in the relative ranking of the 50,002nd most useful reply versus the 50,003rd? The answer is that many tasks, including these examples, can be done by approximation.”

Based on this, IBM pioneered a technique called approximate computing, in which calculations are scaled down from 32-bit floating point to a hybrid 8-bit floating-point (HFP8) format that contains a quarter of the information. This simplified format greatly reduces the amount of numerical computation required to train and run AI models, without sacrificing accuracy.
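To make the idea concrete, here is a minimal Python sketch that rounds values to a toy 8-bit floating-point format and compares a dot product before and after rounding. The 1-4-3 bit split (1 sign bit, 4 exponent bits, 3 mantissa bits), the function name `to_low_precision` and the sample numbers are assumptions chosen for illustration. This is not IBM's HFP8 implementation, and the sketch ignores subnormals, infinities and NaN.

```python
import math

def to_low_precision(x, mantissa_bits=3, exponent_bits=4):
    """Round x to a toy float format (assumed layout, not IBM's HFP8 spec)."""
    if x == 0.0:
        return 0.0
    bias = 2 ** (exponent_bits - 1) - 1
    max_value = (2 - 2.0 ** -mantissa_bits) * 2.0 ** bias  # largest normal value
    min_normal = 2.0 ** (1 - bias)                         # smallest normal value
    mant, exp = math.frexp(x)            # x = mant * 2**exp, 0.5 <= |mant| < 1
    steps = 2 ** (mantissa_bits + 1)     # implicit leading bit + stored bits
    rounded = math.ldexp(round(mant * steps) / steps, exp)
    if abs(rounded) < min_normal:
        return 0.0                       # flush tiny values to zero
    return max(-max_value, min(max_value, rounded))  # saturate on overflow

# Hypothetical weights and activations of a single neuron.
weights = [0.3, -1.7, 2.4, 0.05]
inputs = [1.2, 0.8, -0.5, 3.0]

exact = sum(w * a for w, a in zip(weights, inputs))
approx = sum(to_low_precision(w) * to_low_precision(a)
             for w, a in zip(weights, inputs))

print(f"full precision: {exact:.4f}")
print(f"toy 8-bit:      {approx:.4f}")
print(f"relative error: {abs(approx - exact) / abs(exact):.2%}")
```

Each operand carries only a few significant bits, yet the accumulated result stays close to the full-precision value, which is the intuition behind trading precision for speed.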

A leaner bit format also reduces another drag on speed: less data needs to be moved into and out of memory, so running AI models requires less memory.

IBM has incorporated approximate computing techniques into the design of its new AIU chip, so the precision the AIU works with is considerably lower than that of a CPU. Lower precision is essential to achieving high computational density in the new AIU hardware accelerator.

The AIU computes using hybrid 8-bit floating point (HFP8) instead of the 32-bit or 16-bit floating-point operations typically used for AI training. The low-precision calculation makes the chip run twice as fast as FP16 calculation while providing similar training results.

While low-precision computing is necessary to achieve higher density and faster computation, the accuracy of the deep learning (DL) model must remain consistent with that obtained using high-precision arithmetic.
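As a rough illustration of the memory effect, the storage needed for a model's weights scales linearly with the number of bits per parameter. The helper name `weight_bytes` and the one-billion-parameter model are hypothetical, and the estimate covers weights only, not activations or optimizer state.

```python
def weight_bytes(num_params, bits_per_param):
    """Bytes needed to store the weights alone."""
    return num_params * bits_per_param // 8

params = 1_000_000_000  # hypothetical model size

for name, bits in [("FP32", 32), ("FP16", 16), ("8-bit", 8)]:
    print(f"{name}: {weight_bytes(params, bits) / 1e9:.1f} GB")
```

Under these assumptions, an 8-bit format needs a quarter of the weight storage, and therefore a quarter of the weight traffic, of FP32.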

Simplify AI workflow

Since most AI computing involves matrix and vector multiplication, the IBM AIU chip architecture has a simpler layout than a multi-purpose CPU.
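The kind of workload this refers to can be sketched as a single fully connected layer, which is essentially a matrix-vector product made of independent multiply-accumulate (MAC) operations. The function names, the ReLU activation and the sample values below are illustrative assumptions and say nothing about how the AIU itself is programmed.

```python
def dense_layer(weights, inputs, biases):
    """Compute relu(W @ x + b) with plain lists; each output row is independent."""
    outputs = []
    for row, bias in zip(weights, biases):
        acc = bias
        for w, x in zip(row, inputs):
            acc += w * x  # one multiply-accumulate (MAC)
        outputs.append(max(0.0, acc))
    return outputs

def mac_count(weights):
    """Number of multiply-accumulate operations in one forward pass."""
    return sum(len(row) for row in weights)

W = [[1.0, 2.0], [3.0, -4.0]]  # hypothetical 2x2 weight matrix
x = [1.0, 1.0]
b = [0.0, 0.0]

print(dense_layer(W, x, b))
print(mac_count(W))
```

Because every output row can be computed without reference to the others, hardware built around many parallel MAC units can handle this pattern with far less control logic than a general-purpose core needs.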

IBM AIU is also designed to save a lot of energy by sending data directly from one computing engine to another.

According to IBM, the AIU is a complete system-on-chip based on a scaled-up version of the proven AI accelerator built into IBM's earlier Telum chip (7 nm process). The AIU uses a more advanced 5 nm process, has 32 processing cores and contains 23 billion transistors.

The IBM AIU is also designed to be as easy to use as a graphics card. It can plug into any computer or server with a PCIe slot.

“Deploying AI to categorize cats and dogs in photos is an interesting academic exercise,” IBM said. “But it will not solve the pressing problems we face today. If we want AI to solve real-world complexities, like predicting the next Hurricane Ian or whether we're headed for a recession, we need industrial-grade hardware at the enterprise level. Our AIU brings that vision one step closer.”

How does the IBM AIU perform?

IBM did not disclose more technical information about the AIU on its website. However, we can get a sense of the chip's likely performance by reviewing IBM's 2021 demonstration at the International Solid-State Circuits Conference (ISSCC), where it presented performance results for its early 7 nm chip design.

The prototype IBM presented at the conference was not a 32-core chip but an experimental 4-core 7 nm AI chip that supported FP16 and hybrid FP8 formats for training and inference of deep learning models.

It also supported the INT4 and INT2 formats for scaled-down inference. A 2021 Linley Group newsletter, which included a summary of the prototype's performance, reported on IBM's demonstration:

At peak speed, the 7 nm chip achieves 1.9 teraflops per watt (TF/W) using HFP8.

Using INT4 inference, the experimental chip achieves 16.5 TOPS/W, which is superior to Qualcomm's low-power Cloud AI module.

Considering that the IBM AIU is a scaled-up version of the test chip and that the process has been upgraded to 5 nm, overall energy efficiency is expected to improve further. As the number of cores increases from 4 to 32, overall peak computing power is expected to rise more than eightfold.

Analysts at Forbes say the lack of information makes it impossible to compare IBM's AIU with the GPUs currently used for AI computing, but they expect the chip to cost between $1,500 and $2,000.

IBM AIU chip revealed: 5 nm, 32 cores, 23 billion transistors!
