Skip to content

[SYCL] dot_product support. #2609

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Nov 5, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
148 changes: 148 additions & 0 deletions sycl/doc/extensions/SYCL_ONEAPI_dot_accumulate.asciidoc
Original file line number Diff line number Diff line change
@@ -0,0 +1,148 @@
= SYCL_INTEL_dot_accumulate
:source-highlighter: coderay
:coderay-linenums-mode: table
:doctype: book
:encoding: utf-8
:lang: en

:blank: pass:[ +]

// Set the default source code type in this document to C,
// for syntax highlighting purposes.
:language: c

// This is what is needed for C++, since docbook uses c++
// and everything else uses cpp. This doesn't work when
// source blocks are in table cells, though, so don't use
// C++ unless it is required.
//:language: {basebackend@docbook:c++:cpp}

== Introduction

IMPORTANT: This specification is a draft.

NOTE: Khronos(R) is a registered trademark and SYCL(TM) and SPIR(TM) are trademarks of The Khronos Group Inc. OpenCL(TM) is a trademark of Apple Inc. used by permission by Khronos.

NOTE: This document is better viewed when rendered as html with asciidoctor. GitHub does not render image icons.

== Name Strings

`SYCL_ONEAPI_dot_accumulate`

This is a placeholder name.

== Notice

Copyright (c) 2020 Intel Corporation. All rights reserved.

== Status

Working Draft

This is a preview extension specification, intended to provide early access to a feature for review and community feedback. When the feature matures, this specification may be released as a formal extension.

Because the interfaces defined by this specification are not final and are subject to change they are not intended to be used by shipping software products.

== Version

Built On: {docdate} +
Revision: B

== Contact

Ben Ashbaugh, Intel (ben 'dot' ashbaugh 'at' intel 'dot' com)

== Dependencies

This extension is written against the SYCL 1.2.1 specification, Revision v1.2.1-6.

== Overview

This extension adds new SYCL built-in functions that may simplify development and provide access specialized hardware instructions when a SYCL kernel needs to perform a dot product of two vectors followed by a scalar accumulation.

== Enabling the extension

The extension is always enabled. The dot product functionality may be emulated in software or executed using hardware when suitable instructions are available.

== Modifications of SYCL 1.2.1 specification

=== Add to Section 4.13.6 - Geometric Functions

Additionally, the following additional functions are available in the namespace `cl::sycl::intel` on the host and device.

[cols="4a,4",options="header"]
|====
| *Function*
| *Description*

|[source,c]
----
int32_t dot_acc(vec<int8_t,4> a,
vec<int8_t,4> b,
int32_t c)
int32_t dot_acc(vec<int8_t,4> a,
vec<uint8_t,4> b,
int32_t c)
int32_t dot_acc(vec<uint8_t,4> a,
vec<int8_t,4> b,
int32_t c)
int32_t dot_acc(vec<uint8_t,4> a,
vec<uint8_t,4> b,
int32_t c)
----

|Performs a four-component integer dot product accumulate operation. +
{blank}
The value that is returned is equivalent to +
{blank}
*dot*(_a_, _b_) + _c_

|====

== Sample Header

[source,c++]
----
namespace cl {
namespace sycl {
namespace ONEAPI {

int32_t dot_acc(vec<int8_t,4> a, vec<int8_t,4> b, int32_t c);
int32_t dot_acc(vec<int8_t,4> a, vec<uint8_t,4> b, int32_t c);
int32_t dot_acc(vec<uint8_t,4> a, vec<int8_t,4> b, int32_t c);
int32_t dot_acc(vec<uint8_t,4> a, vec<uint8_t,4> b, int32_t c);

int32_t dot_acc(int32_t a, int32_t b, int32_t c);
int32_t dot_acc(int32_t a, uint32_t b, int32_t c);
int32_t dot_acc(uint32_t a, int32_t b, int32_t c);
int32_t dot_acc(uint32_t a, uint32_t b, int32_t c);

} // ONEAPI
} // sycl
} // cl
----

== Issues

None.

== Revision History

[cols="5,15,15,70"]
[grid="rows"]
[options="header"]
|========================================
|Rev|Date|Author|Changes
|A|2019-12-13|Ben Ashbaugh|*Initial draft*
|B|2019-12-18|Ben Ashbaugh|Switched to standard C++ fixed width types.
|C|2020-10-26|Rajiv Deodhar|Added int32 types.
|========================================

//************************************************************************
//Other formatting suggestions:
//
//* Use *bold* text for host APIs, or [source] syntax highlighting.
//* Use `mono` text for device APIs, or [source] syntax highlighting.
//* Use `mono` text for extension names, types, or enum values.
//* Use _italics_ for parameters.
//************************************************************************
76 changes: 76 additions & 0 deletions sycl/include/CL/sycl/ONEAPI/dot_product.hpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,76 @@
//==----------- dot_product.hpp ------- SYCL dot-product -------------------==//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

// DP4A extension

#pragma once

__SYCL_INLINE_NAMESPACE(cl) {
namespace sycl {
namespace ONEAPI {

union Us {
char s[4];
int32_t i;
};
union Uu {
unsigned char s[4];
uint32_t i;
};

int32_t dot_acc(int32_t pa, int32_t pb, int32_t c) {
Us a = *(reinterpret_cast<Us *>(&pa));
Us b = *(reinterpret_cast<Us *>(&pb));
return a.s[0] * b.s[0] + a.s[1] * b.s[1] + a.s[2] * b.s[2] + a.s[3] * b.s[3] +
c;
}

int32_t dot_acc(uint32_t pa, uint32_t pb, int32_t c) {
Uu a = *(reinterpret_cast<Uu *>(&pa));
Uu b = *(reinterpret_cast<Uu *>(&pb));
return a.s[0] * b.s[0] + a.s[1] * b.s[1] + a.s[2] * b.s[2] + a.s[3] * b.s[3] +
c;
}

int32_t dot_acc(int32_t pa, uint32_t pb, int32_t c) {
Us a = *(reinterpret_cast<Us *>(&pa));
Uu b = *(reinterpret_cast<Uu *>(&pb));
return a.s[0] * b.s[0] + a.s[1] * b.s[1] + a.s[2] * b.s[2] + a.s[3] * b.s[3] +
c;
}

int32_t dot_acc(uint32_t pa, int32_t pb, int32_t c) {
Uu a = *(reinterpret_cast<Uu *>(&pa));
Us b = *(reinterpret_cast<Us *>(&pb));
return a.s[0] * b.s[0] + a.s[1] * b.s[1] + a.s[2] * b.s[2] + a.s[3] * b.s[3] +
c;
}

int32_t dot_acc(vec<int8_t, 4> a, vec<int8_t, 4> b, int32_t c) {
return a.s0() * b.s0() + a.s1() * b.s1() + a.s2() * b.s2() + a.s3() * b.s3() +
c;
}

int32_t dot_acc(vec<uint8_t, 4> a, vec<uint8_t, 4> b, int32_t c) {
return a.s0() * b.s0() + a.s1() * b.s1() + a.s2() * b.s2() + a.s3() * b.s3() +
c;
}

int32_t dot_acc(vec<uint8_t, 4> a, vec<int8_t, 4> b, int32_t c) {
return a.s0() * b.s0() + a.s1() * b.s1() + a.s2() * b.s2() + a.s3() * b.s3() +
c;
}

int32_t dot_acc(vec<int8_t, 4> a, vec<uint8_t, 4> b, int32_t c) {
return a.s0() * b.s0() + a.s1() * b.s1() + a.s2() * b.s2() + a.s3() * b.s3() +
c;
}

} // namespace ONEAPI
} // namespace sycl
} // __SYCL_INLINE_NAMESPACE(cl)
Loading