Code review - Issue 5754058: Support for Runtime CPU type detection via builtins
https://codereview.appspot.com/
2012-06-12T02:56:32+00:00
Message from unknown
2012-03-07T00:04:34+00:00 Sriraman
Message from unknown
2012-03-07T00:48:55+00:00 Sriraman
Message from richard.guenther@gmail.com
2012-03-07T13:51:36+00:00 richard.guenther_gmail.com
On Wed, Mar 7, 2012 at 1:49 AM, Sriraman Tallam <tmsriram@google.com> wrote:
> Patch for CPU detection at run-time.
> ===================================
>
> Patch for CPU detection at run-time, to be used in dispatching of
> multi-versioned functions. Please see this discussion:
> http://gcc.gnu.org/ml/gcc-patches/2011-08/msg01355.html
> when this patch was reviewed the last time.
>
> For a more detailed description:
> http://gcc.gnu.org/ml/gcc/2012-03/msg00074.html
>
> One of the main concerns was about making CPU detection initialization a
> constructor. The main point raised was about constructor ordering. I have
> given the CPU detection constructor a very high priority, 101, so that it is
> guaranteed to fire before every constructor that has no explicitly marked
> priority. However, IFUNC initializers will still fire before this
> constructor, so such initializers have to call the CPU initialization
> routine explicitly, for which I have added a builtin:
> __builtin_cpu_init ().
>
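The ordering guarantee described above can be illustrated with a minimal sketch. It assumes GCC (or a compatible compiler) on an ELF target; the names order_log, n_logged, early_init, late_init and constructors_ran_in_order are hypothetical and do not appear in the patch. It covers only constructor priorities, not the IFUNC case.

```c
#include <assert.h>

/* Hypothetical log of the order in which the constructors ran. */
static int order_log[2];
static int n_logged;

/* Explicit priority 101, as in the patch: runs before constructors
   that have no explicit priority. */
static void __attribute__ ((constructor (101)))
early_init (void)
{
  assert (n_logged < 2);
  order_log[n_logged++] = 101;
}

/* No explicit priority: runs after all prioritized constructors. */
static void __attribute__ ((constructor))
late_init (void)
{
  assert (n_logged < 2);
  order_log[n_logged++] = 0;
}

/* Returns 1 if early_init ran before late_init, 0 otherwise. */
int
constructors_ran_in_order (void)
{
  return n_logged == 2 && order_log[0] == 101 && order_log[1] == 0;
}
```

Calling constructors_ran_in_order () from main should report that the priority-101 constructor ran first, whatever order the two functions are defined in.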
> This patch adds the following new builtins:
>
> * __builtin_cpu_init
> * __builtin_cpu_supports_cmov
> * __builtin_cpu_supports_mmx
> * __builtin_cpu_supports_popcount
> * __builtin_cpu_supports_sse
> * __builtin_cpu_supports_sse2
> * __builtin_cpu_supports_sse3
> * __builtin_cpu_supports_ssse3
> * __builtin_cpu_supports_sse4_1
> * __builtin_cpu_supports_sse4_2
> * __builtin_cpu_is_amd
> * __builtin_cpu_is_intel_atom
> * __builtin_cpu_is_intel_core2
> * __builtin_cpu_is_intel
> * __builtin_cpu_is_intel_corei7
> * __builtin_cpu_is_intel_corei7_nehalem
> * __builtin_cpu_is_intel_corei7_westmere
> * __builtin_cpu_is_intel_corei7_sandybridge
> * __builtin_cpu_is_amdfam10
> * __builtin_cpu_is_amdfam10_barcelona
> * __builtin_cpu_is_amdfam10_shanghai
> * __builtin_cpu_is_amdfam10_istanbul
> * __builtin_cpu_is_amdfam15_bdver1
> * __builtin_cpu_is_amdfam15_bdver2
I think the non-feature detection functions are not necessary at all.
Builtin functions are not exactly cheap, nor is the scheme you invent
backward/forward compatible. Instead, why not add a single builtin
function, __builtin_cpu_supports(const char *), and decode from
a comma-separated list of features? Unknown features are simply
"not present". So I can write code with only a single configure check,
for __builtin_cpu_supports, and cater for future features or older compilers.
And of course that builtin would be even cross-platform.
Implementation-wise I'll leave this to x86 maintainers to comment on.
Richard.
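A minimal sketch of the semantics suggested above, written as an ordinary C function rather than a compiler builtin. Everything here is an assumption for illustration: the name cpu_supports, the hard-coded feature table, the feature-name spellings, and the choice that the call succeeds only if every listed feature is present. The one property taken from the proposal is that unknown feature names are treated as "not present".

```c
#include <assert.h>
#include <string.h>

/* Hypothetical feature table; the patch keeps this data in __cpu_features. */
struct feature { const char *name; int present; };

static const struct feature known_features[] = {
  { "cmov", 1 }, { "mmx", 1 }, { "sse", 1 }, { "sse2", 1 }, { "sse4.2", 0 }
};

/* Look up one name of length LEN; unknown names count as "not present". */
static int
feature_present (const char *name, size_t len)
{
  size_t i;
  for (i = 0; i < sizeof known_features / sizeof known_features[0]; i++)
    if (strlen (known_features[i].name) == len
        && memcmp (known_features[i].name, name, len) == 0)
      return known_features[i].present;
  return 0;
}

/* Hypothetical stand-in for the proposed __builtin_cpu_supports: returns 1
   only if every feature in the comma-separated LIST is present. */
int
cpu_supports (const char *list)
{
  assert (list != NULL);
  while (*list != '\0')
    {
      size_t len = strcspn (list, ",");
      if (!feature_present (list, len))
        return 0;
      list += len;
      if (*list == ',')
        list++;
    }
  return 1;
}
```

With this shape, code written against a newer feature name still compiles and simply takes the fallback path when the name is not recognised.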
>
> * config/i386/i386.c (build_struct_with_one_bit_fields): New function.
> (make_var_decl): New function.
> (get_field_from_struct): New function.
> (fold_builtin_cpu): New function.
> (ix86_fold_builtin): New function.
> (ix86_expand_builtin): Expand new builtins by folding them.
> (make_platform_builtin): New function.
> (ix86_init_platform_type_builtins): Make the new builtins.
> (ix86_init_builtins): Make new builtins to detect CPU type.
> (TARGET_FOLD_BUILTIN): New macro.
> (IX86_BUILTIN_CPU_SUPPORTS_CMOV): New enum value.
> (IX86_BUILTIN_CPU_SUPPORTS_MMX): New enum value.
> (IX86_BUILTIN_CPU_SUPPORTS_POPCOUNT): New enum value.
> (IX86_BUILTIN_CPU_SUPPORTS_SSE): New enum value.
> (IX86_BUILTIN_CPU_SUPPORTS_SSE2): New enum value.
> (IX86_BUILTIN_CPU_SUPPORTS_SSE3): New enum value.
> (IX86_BUILTIN_CPU_SUPPORTS_SSSE3): New enum value.
> (IX86_BUILTIN_CPU_SUPPORTS_SSE4_1): New enum value.
> (IX86_BUILTIN_CPU_SUPPORTS_SSE4_2): New enum value.
> (IX86_BUILTIN_CPU_INIT): New enum value.
> (IX86_BUILTIN_CPU_IS_AMD): New enum value.
> (IX86_BUILTIN_CPU_IS_INTEL): New enum value.
> (IX86_BUILTIN_CPU_IS_INTEL_ATOM): New enum value.
> (IX86_BUILTIN_CPU_IS_INTEL_CORE2): New enum value.
> (IX86_BUILTIN_CPU_IS_INTEL_COREI7_NEHALEM): New enum value.
> (IX86_BUILTIN_CPU_IS_INTEL_COREI7_WESTMERE): New enum value.
> (IX86_BUILTIN_CPU_IS_INTEL_COREI7_SANDYBRIDGE): New enum value.
> (IX86_BUILTIN_CPU_IS_AMDFAM10_BARCELONA): New enum value.
> (IX86_BUILTIN_CPU_IS_AMDFAM10_SHANGHAI): New enum value.
> (IX86_BUILTIN_CPU_IS_AMDFAM10_ISTANBUL): New enum value.
> (IX86_BUILTIN_CPU_IS_AMDFAM15H_BDVER1): New enum value.
> (IX86_BUILTIN_CPU_IS_AMDFAM15H_BDVER2): New enum value.
> * config/i386/i386-builtin-types.def: New function type.
> * testsuite/gcc.target/i386/builtin_target.c: New testcase.
>
> * libgcc/config/i386/i386-cpuinfo.c: New file.
> * libgcc/config/i386/t-cpuinfo: New file.
> * libgcc/config.host: Include t-cpuinfo.
> * libgcc/config/i386/libgcc-glibc.ver: Version symbols __cpu_model
> and __cpu_features.
>
> Index: libgcc/config.host
> ===================================================================
> --- libgcc/config.host (revision 184971)
> +++ libgcc/config.host (working copy)
> @@ -1142,7 +1142,7 @@ i[34567]86-*-linux* | x86_64-*-linux* | \
> i[34567]86-*-kfreebsd*-gnu | x86_64-*-kfreebsd*-gnu | \
> i[34567]86-*-knetbsd*-gnu | \
> i[34567]86-*-gnu*)
> - tmake_file="${tmake_file} t-tls i386/t-linux"
> + tmake_file="${tmake_file} t-tls i386/t-linux i386/t-cpuinfo"
> if test "$libgcc_cv_cfi" = "yes"; then
> tmake_file="${tmake_file} t-stack i386/t-stack-i386"
> fi
> Index: libgcc/config/i386/t-cpuinfo
> ===================================================================
> --- libgcc/config/i386/t-cpuinfo (revision 0)
> +++ libgcc/config/i386/t-cpuinfo (revision 0)
> @@ -0,0 +1 @@
> +LIB2ADD += $(srcdir)/config/i386/i386-cpuinfo.c
> Index: libgcc/config/i386/i386-cpuinfo.c
> ===================================================================
> --- libgcc/config/i386/i386-cpuinfo.c (revision 0)
> +++ libgcc/config/i386/i386-cpuinfo.c (revision 0)
> @@ -0,0 +1,306 @@
> +/* Get CPU type and Features for x86 processors.
> + Copyright (C) 2011 Free Software Foundation, Inc.
> + Contributed by Sriraman Tallam (tmsriram@google.com)
> +
> +This file is part of GCC.
> +
> +GCC is free software; you can redistribute it and/or modify it under
> +the terms of the GNU General Public License as published by the Free
> +Software Foundation; either version 3, or (at your option) any later
> +version.
> +
> +GCC is distributed in the hope that it will be useful, but WITHOUT ANY
> +WARRANTY; without even the implied warranty of MERCHANTABILITY or
> +FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
> +for more details.
> +
> +You should have received a copy of the GNU General Public License
> +along with GCC; see the file COPYING3. If not see
> +<http://www.gnu.org/licenses/>. */
> +
> +#include "cpuid.h"
> +#include "tsystem.h"
> +
> +int __cpu_indicator_init (void) __attribute__ ((constructor (101)));
> +
> +enum vendor_signatures
> +{
> + SIG_INTEL = 0x756e6547 /* Genu */,
> + SIG_AMD = 0x68747541 /* Auth */
> +};
> +
> +/* ISA Features supported. */
> +
> +struct __processor_features
> +{
> + unsigned int __cpu_cmov : 1;
> + unsigned int __cpu_mmx : 1;
> + unsigned int __cpu_popcnt : 1;
> + unsigned int __cpu_sse : 1;
> + unsigned int __cpu_sse2 : 1;
> + unsigned int __cpu_sse3 : 1;
> + unsigned int __cpu_ssse3 : 1;
> + unsigned int __cpu_sse4_1 : 1;
> + unsigned int __cpu_sse4_2 : 1;
> +} __cpu_features;
> +
> +/* Processor Model. */
> +
> +struct __processor_model
> +{
> + /* Vendor. */
> + unsigned int __cpu_is_amd : 1;
> + unsigned int __cpu_is_intel : 1;
> + /* CPU type. */
> + unsigned int __cpu_is_intel_atom : 1;
> + unsigned int __cpu_is_intel_core2 : 1;
> + unsigned int __cpu_is_intel_corei7 : 1;
> + unsigned int __cpu_is_intel_corei7_nehalem : 1;
> + unsigned int __cpu_is_intel_corei7_westmere : 1;
> + unsigned int __cpu_is_intel_corei7_sandybridge : 1;
> + unsigned int __cpu_is_amdfam10h : 1;
> + unsigned int __cpu_is_amdfam10h_barcelona : 1;
> + unsigned int __cpu_is_amdfam10h_shanghai : 1;
> + unsigned int __cpu_is_amdfam10h_istanbul : 1;
> + unsigned int __cpu_is_amdfam15h_bdver1 : 1;
> + unsigned int __cpu_is_amdfam15h_bdver2 : 1;
> +} __cpu_model;
> +
> +/* Get the specific type of AMD CPU. */
> +
> +static void
> +get_amd_cpu (unsigned int family, unsigned int model)
> +{
> + switch (family)
> + {
> + /* AMD Family 10h. */
> + case 0x10:
> + switch (model)
> + {
> + case 0x2:
> + /* Barcelona. */
> + __cpu_model.__cpu_is_amdfam10h = 1;
> + __cpu_model.__cpu_is_amdfam10h_barcelona = 1;
> + break;
> + case 0x4:
> + /* Shanghai. */
> + __cpu_model.__cpu_is_amdfam10h = 1;
> + __cpu_model.__cpu_is_amdfam10h_shanghai = 1;
> + break;
> + case 0x8:
> + /* Istanbul. */
> + __cpu_model.__cpu_is_amdfam10h = 1;
> + __cpu_model.__cpu_is_amdfam10h_istanbul = 1;
> + break;
> + default:
> + break;
> + }
> + break;
> + /* AMD Family 15h. */
> + case 0x15:
> + /* Bulldozer version 1. */
> + if (model >= 0 && model <= 0xf)
> + __cpu_model.__cpu_is_amdfam15h_bdver1 = 1;
> + /* Bulldozer version 2. */
> + if (model >= 0x10 && model <= 0x1f)
> + __cpu_model.__cpu_is_amdfam15h_bdver2 = 1;
> + break;
> + default:
> + break;
> + }
> +}
> +
> +/* Get the specific type of Intel CPU. */
> +
> +static void
> +get_intel_cpu (unsigned int family, unsigned int model, unsigned int brand_id)
> +{
> + /* Parse family and model only if brand ID is 0. */
> + if (brand_id == 0)
> + {
> + switch (family)
> + {
> + case 0x5:
> + /* Pentium. */
> + break;
> + case 0x6:
> + switch (model)
> + {
> + case 0x1c:
> + case 0x26:
> + /* Atom. */
> + __cpu_model.__cpu_is_intel_atom = 1;
> + break;
> + case 0x1a:
> + case 0x1e:
> + case 0x1f:
> + case 0x2e:
> + /* Nehalem. */
> + __cpu_model.__cpu_is_intel_corei7 = 1;
> + __cpu_model.__cpu_is_intel_corei7_nehalem = 1;
> + break;
> + case 0x25:
> + case 0x2c:
> + case 0x2f:
> + /* Westmere. */
> + __cpu_model.__cpu_is_intel_corei7 = 1;
> + __cpu_model.__cpu_is_intel_corei7_westmere = 1;
> + break;
> + case 0x2a:
> + /* Sandy Bridge. */
> + __cpu_model.__cpu_is_intel_corei7 = 1;
> + __cpu_model.__cpu_is_intel_corei7_sandybridge = 1;
> + break;
> + case 0x17:
> + case 0x1d:
> + /* Penryn. */
> + case 0x0f:
> + /* Merom. */
> + __cpu_model.__cpu_is_intel_core2 = 1;
> + break;
> + default:
> + break;
> + }
> + break;
> + default:
> + /* We have no idea. */
> + break;
> + }
> + }
> +}
> +
> +static void
> +get_available_features (unsigned int ecx, unsigned int edx)
> +{
> + __cpu_features.__cpu_cmov = (edx & bit_CMOV) ? 1 : 0;
> + __cpu_features.__cpu_mmx = (edx & bit_MMX) ? 1 : 0;
> + __cpu_features.__cpu_sse = (edx & bit_SSE) ? 1 : 0;
> + __cpu_features.__cpu_sse2 = (edx & bit_SSE2) ? 1 : 0;
> + __cpu_features.__cpu_popcnt = (ecx & bit_POPCNT) ? 1 : 0;
> + __cpu_features.__cpu_sse3 = (ecx & bit_SSE3) ? 1 : 0;
> + __cpu_features.__cpu_ssse3 = (ecx & bit_SSSE3) ? 1 : 0;
> + __cpu_features.__cpu_sse4_1 = (ecx & bit_SSE4_1) ? 1 : 0;
> + __cpu_features.__cpu_sse4_2 = (ecx & bit_SSE4_2) ? 1 : 0;
> +}
> +
> +
> +/* Sanity check for the vendor and cpu type flags. */
> +
> +static int
> +sanity_check (void)
> +{
> + unsigned int one_type = 0;
> +
> + /* Vendor cannot be Intel and AMD. */
> + gcc_assert((__cpu_model.__cpu_is_intel == 0)
> + || (__cpu_model.__cpu_is_amd == 0));
> +
> + /* Only one CPU type can be set. */
> + one_type = (__cpu_model.__cpu_is_intel_atom
> + + __cpu_model.__cpu_is_intel_core2
> + + __cpu_model.__cpu_is_intel_corei7_nehalem
> + + __cpu_model.__cpu_is_intel_corei7_westmere
> + + __cpu_model.__cpu_is_intel_corei7_sandybridge
> + + __cpu_model.__cpu_is_amdfam10h_barcelona
> + + __cpu_model.__cpu_is_amdfam10h_shanghai
> + + __cpu_model.__cpu_is_amdfam10h_istanbul
> + + __cpu_model.__cpu_is_amdfam15h_bdver1
> + + __cpu_model.__cpu_is_amdfam15h_bdver2);
> +
> + gcc_assert (one_type <= 1);
> + return 0;
> +}
> +
> +/* A noinline function calling __get_cpuid. Having many calls to
> + cpuid in one function in 32-bit mode causes GCC to complain:
> + "can’t find a register in class ‘CLOBBERED_REGS’". This is
> + related to PR rtl-optimization 44174. */
> +
> +static int __attribute__ ((noinline))
> +__get_cpuid_output (unsigned int __level,
> + unsigned int *__eax, unsigned int *__ebx,
> + unsigned int *__ecx, unsigned int *__edx)
> +{
> + return __get_cpuid (__level, __eax, __ebx, __ecx, __edx);
> +}
> +
> +
> +/* A constructor function that sets __cpu_model and __cpu_features with
> + the right values. This needs to run only once. This constructor is
> + given the highest priority and it should run before constructors without
> + the priority set. However, it still runs after ifunc initializers and
> + needs to be called explicitly there. */
> +
> +int __attribute__ ((constructor (101)))
> +__cpu_indicator_init (void)
> +{
> + unsigned int eax, ebx, ecx, edx;
> +
> + int max_level = 5;
> + unsigned int vendor;
> + unsigned int model, family, brand_id;
> + unsigned int extended_model, extended_family;
> + static int called = 0;
> +
> + /* This function needs to run just once. */
> + if (called)
> + return 0;
> + else
> + called = 1;
> +
> + /* Assume cpuid insn present. Run in level 0 to get vendor id. */
> + if (!__get_cpuid_output (0, &eax, &ebx, &ecx, &edx))
> + return -1;
> +
> + vendor = ebx;
> + max_level = eax;
> +
> + if (max_level < 1)
> + return -1;
> +
> + if (!__get_cpuid_output (1, &eax, &ebx, &ecx, &edx))
> + return -1;
> +
> + model = (eax >> 4) & 0x0f;
> + family = (eax >> 8) & 0x0f;
> + brand_id = ebx & 0xff;
> + extended_model = (eax >> 12) & 0xf0;
> + extended_family = (eax >> 20) & 0xff;
> +
> + if (vendor == SIG_INTEL)
> + {
> + /* Adjust model and family for Intel CPUS. */
> + if (family == 0x0f)
> + {
> + family += extended_family;
> + model += extended_model;
> + }
> + else if (family == 0x06)
> + model += extended_model;
> +
> + /* Get CPU type. */
> + __cpu_model.__cpu_is_intel = 1;
> + get_intel_cpu (family, model, brand_id);
> + }
> +
> + if (vendor == SIG_AMD)
> + {
> + /* Adjust model and family for AMD CPUS. */
> + if (family == 0x0f)
> + {
> + family += extended_family;
> + model += extended_model;
> + }
> +
> + /* Get CPU type. */
> + __cpu_model.__cpu_is_amd = 1;
> + get_amd_cpu (family, model);
> + }
> +
> + /* Find available features. */
> + get_available_features (ecx, edx);
> +
> + sanity_check ();
> +
> + return 0;
> +}
> Index: libgcc/config/i386/libgcc-glibc.ver
> ===================================================================
> --- libgcc/config/i386/libgcc-glibc.ver (revision 184971)
> +++ libgcc/config/i386/libgcc-glibc.ver (working copy)
> @@ -147,6 +147,11 @@ GCC_4.3.0 {
> __trunctfxf2
> __unordtf2
> }
> +
> +GCC_4.8.0 {
> + __cpu_model
> + __cpu_features
> +}
> %else
> GCC_4.4.0 {
> __addtf3
> @@ -183,4 +188,9 @@ GCC_4.4.0 {
> GCC_4.5.0 {
> __extendxftf2
> }
> +
> +GCC_4.8.0 {
> + __cpu_model
> + __cpu_features
> +}
> %endif
> Index: gcc/config/i386/i386-builtin-types.def
> ===================================================================
> --- gcc/config/i386/i386-builtin-types.def (revision 184971)
> +++ gcc/config/i386/i386-builtin-types.def (working copy)
> @@ -143,6 +143,7 @@ DEF_FUNCTION_TYPE (UINT64)
> DEF_FUNCTION_TYPE (UNSIGNED)
> DEF_FUNCTION_TYPE (VOID)
> DEF_FUNCTION_TYPE (PVOID)
> +DEF_FUNCTION_TYPE (INT)
>
> DEF_FUNCTION_TYPE (FLOAT, FLOAT)
> DEF_FUNCTION_TYPE (FLOAT128, FLOAT128)
> Index: gcc/config/i386/i386.c
> ===================================================================
> --- gcc/config/i386/i386.c (revision 184971)
> +++ gcc/config/i386/i386.c (working copy)
> @@ -25637,6 +25637,33 @@ enum ix86_builtins
> /* CFString built-in for darwin */
> IX86_BUILTIN_CFSTRING,
>
> + /* Builtins to get CPU features. */
> + IX86_BUILTIN_CPU_SUPPORTS_CMOV,
> + IX86_BUILTIN_CPU_SUPPORTS_MMX,
> + IX86_BUILTIN_CPU_SUPPORTS_POPCOUNT,
> + IX86_BUILTIN_CPU_SUPPORTS_SSE,
> + IX86_BUILTIN_CPU_SUPPORTS_SSE2,
> + IX86_BUILTIN_CPU_SUPPORTS_SSE3,
> + IX86_BUILTIN_CPU_SUPPORTS_SSSE3,
> + IX86_BUILTIN_CPU_SUPPORTS_SSE4_1,
> + IX86_BUILTIN_CPU_SUPPORTS_SSE4_2,
> + /* Builtins to get CPU type. */
> + IX86_BUILTIN_CPU_INIT,
> + IX86_BUILTIN_CPU_IS_AMD,
> + IX86_BUILTIN_CPU_IS_INTEL,
> + IX86_BUILTIN_CPU_IS_INTEL_ATOM,
> + IX86_BUILTIN_CPU_IS_INTEL_CORE2,
> + IX86_BUILTIN_CPU_IS_INTEL_COREI7,
> + IX86_BUILTIN_CPU_IS_INTEL_COREI7_NEHALEM,
> + IX86_BUILTIN_CPU_IS_INTEL_COREI7_WESTMERE,
> + IX86_BUILTIN_CPU_IS_INTEL_COREI7_SANDYBRIDGE,
> + IX86_BUILTIN_CPU_IS_AMDFAM10H,
> + IX86_BUILTIN_CPU_IS_AMDFAM10H_BARCELONA,
> + IX86_BUILTIN_CPU_IS_AMDFAM10H_SHANGHAI,
> + IX86_BUILTIN_CPU_IS_AMDFAM10H_ISTANBUL,
> + IX86_BUILTIN_CPU_IS_AMDFAM15H_BDVER1,
> + IX86_BUILTIN_CPU_IS_AMDFAM15H_BDVER2,
> +
> IX86_BUILTIN_MAX
> };
>
> @@ -27446,6 +27473,593 @@ ix86_init_mmx_sse_builtins (void)
> }
> }
>
> +/* Returns a struct type with name NAME and number of fields equal to
> + NUM_FIELDS. Each field is an unsigned int bit-field of length 1 bit. */
> +
> +static tree
> +build_struct_with_one_bit_fields (int num_fields, const char *name)
> +{
> + int i;
> + char field_name [10];
> + tree field = NULL_TREE, field_chain = NULL_TREE;
> + tree type = make_node (RECORD_TYPE);
> +
> + strcpy (field_name, "k_field");
> +
> + for (i = 0; i < num_fields; i++)
> + {
> + /* Name the fields, 0_field, 1_field, ... */
> + field_name [0] = '0' + i;
> + field = build_decl (UNKNOWN_LOCATION, FIELD_DECL,
> + get_identifier (field_name), unsigned_type_node);
> + DECL_BIT_FIELD (field) = 1;
> + DECL_SIZE (field) = bitsize_one_node;
> + if (field_chain != NULL_TREE)
> + DECL_CHAIN (field) = field_chain;
> + field_chain = field;
> + }
> + finish_builtin_struct (type, name, field_chain, NULL_TREE);
> + return type;
> +}
> +
> +/* Returns an extern, comdat VAR_DECL of type TYPE and name NAME. */
> +
> +static tree
> +make_var_decl (tree type, const char *name)
> +{
> + tree new_decl;
> + struct varpool_node *vnode;
> +
> + new_decl = build_decl (UNKNOWN_LOCATION,
> + VAR_DECL,
> + get_identifier(name),
> + type);
> +
> + DECL_EXTERNAL (new_decl) = 1;
> + TREE_STATIC (new_decl) = 1;
> + TREE_PUBLIC (new_decl) = 1;
> + DECL_INITIAL (new_decl) = 0;
> + DECL_ARTIFICIAL (new_decl) = 0;
> + DECL_PRESERVE_P (new_decl) = 1;
> +
> + make_decl_one_only (new_decl, DECL_ASSEMBLER_NAME (new_decl));
> + assemble_variable (new_decl, 0, 0, 0);
> +
> + vnode = varpool_node (new_decl);
> + gcc_assert (vnode != NULL);
> + /* Set finalized to 1, otherwise it asserts in function "write_symbol" in
> + lto-streamer-out.c. */
> + vnode->finalized = 1;
> +
> + return new_decl;
> +}
> +
> +/* Traverses the chain of fields in STRUCT_TYPE and returns the FIELD_NUM
> + numbered field. */
> +
> +static tree
> +get_field_from_struct (tree struct_type, int field_num)
> +{
> + int i;
> + tree field = TYPE_FIELDS (struct_type);
> +
> + for (i = 0; i < field_num; i++, field = DECL_CHAIN(field))
> + {
> + gcc_assert (field != NULL_TREE);
> + }
> +
> + return field;
> +}
> +
> +/* FN_CODE identifies a __builtin_cpu_* call, which is folded into a read of
> + the matching bit-field defined in libgcc/config/i386/i386-cpuinfo.c. */
> +
> +static tree
> +fold_builtin_cpu (enum ix86_builtins fn_code)
> +{
> + /* This is the order of bit-fields in __processor_features in
> + i386-cpuinfo.c */
> + enum processor_features
> + {
> + F_CMOV = 0,
> + F_MMX,
> + F_POPCNT,
> + F_SSE,
> + F_SSE2,
> + F_SSE3,
> + F_SSSE3,
> + F_SSE4_1,
> + F_SSE4_2,
> + F_MAX
> + };
> +
> + /* This is the order of bit-fields in __processor_model in
> + i386-cpuinfo.c */
> + enum processor_model
> + {
> + M_AMD = 0,
> + M_INTEL,
> + M_INTEL_ATOM,
> + M_INTEL_CORE2,
> + M_INTEL_COREI7,
> + M_INTEL_COREI7_NEHALEM,
> + M_INTEL_COREI7_WESTMERE,
> + M_INTEL_COREI7_SANDYBRIDGE,
> + M_AMDFAM10H,
> + M_AMDFAM10H_BARCELONA,
> + M_AMDFAM10H_SHANGHAI,
> + M_AMDFAM10H_ISTANBUL,
> + M_AMDFAM15H_BDVER1,
> + M_AMDFAM15H_BDVER2,
> + M_MAX
> + };
> +
> + static tree __processor_features_type = NULL_TREE;
> + static tree __cpu_features_var = NULL_TREE;
> + static tree __processor_model_type = NULL_TREE;
> + static tree __cpu_model_var = NULL_TREE;
> + static tree field;
> + static tree which_struct;
> +
> + if (__processor_features_type == NULL_TREE)
> + __processor_features_type = build_struct_with_one_bit_fields (F_MAX,
> + "__processor_features");
> +
> + if (__processor_model_type == NULL_TREE)
> + __processor_model_type = build_struct_with_one_bit_fields (M_MAX,
> + "__processor_model");
> +
> + if (__cpu_features_var == NULL_TREE)
> + __cpu_features_var = make_var_decl (__processor_features_type,
> + "__cpu_features");
> +
> + if (__cpu_model_var == NULL_TREE)
> + __cpu_model_var = make_var_decl (__processor_model_type,
> + "__cpu_model");
> +
> + /* Look at the code to identify the field requested. */
> + switch (fn_code)
> + {
> + case IX86_BUILTIN_CPU_SUPPORTS_CMOV:
> + field = get_field_from_struct (__processor_features_type, F_CMOV);
> + which_struct = __cpu_features_var;
> + break;
> + case IX86_BUILTIN_CPU_SUPPORTS_MMX:
> + field = get_field_from_struct (__processor_features_type, F_MMX);
> + which_struct = __cpu_features_var;
> + break;
> + case IX86_BUILTIN_CPU_SUPPORTS_POPCOUNT:
> + field = get_field_from_struct (__processor_features_type, F_POPCNT);
> + which_struct = __cpu_features_var;
> + break;
> + case IX86_BUILTIN_CPU_SUPPORTS_SSE:
> + field = get_field_from_struct (__processor_features_type, F_SSE);
> + which_struct = __cpu_features_var;
> + break;
> + case IX86_BUILTIN_CPU_SUPPORTS_SSE2:
> + field = get_field_from_struct (__processor_features_type, F_SSE2);
> + which_struct = __cpu_features_var;
> + break;
> + case IX86_BUILTIN_CPU_SUPPORTS_SSE3:
> + field = get_field_from_struct (__processor_features_type, F_SSE3);
> + which_struct = __cpu_features_var;
> + break;
> + case IX86_BUILTIN_CPU_SUPPORTS_SSSE3:
> + field = get_field_from_struct (__processor_features_type, F_SSSE3);
> + which_struct = __cpu_features_var;
> + break;
> + case IX86_BUILTIN_CPU_SUPPORTS_SSE4_1:
> + field = get_field_from_struct (__processor_features_type, F_SSE4_1);
> + which_struct = __cpu_features_var;
> + break;
> + case IX86_BUILTIN_CPU_SUPPORTS_SSE4_2:
> + field = get_field_from_struct (__processor_features_type, F_SSE4_2);
> + which_struct = __cpu_features_var;
> + break;
> + case IX86_BUILTIN_CPU_IS_AMD:
> + field = get_field_from_struct (__processor_model_type, M_AMD);
> + which_struct = __cpu_model_var;
> + break;
> + case IX86_BUILTIN_CPU_IS_INTEL:
> + field = get_field_from_struct (__processor_model_type, M_INTEL);
> + which_struct = __cpu_model_var;
> + break;
> + case IX86_BUILTIN_CPU_IS_INTEL_ATOM:
> + field = get_field_from_struct (__processor_model_type, M_INTEL_ATOM);
> + which_struct = __cpu_model_var;
> + break;
> + case IX86_BUILTIN_CPU_IS_INTEL_CORE2:
> + field = get_field_from_struct (__processor_model_type, M_INTEL_CORE2);
> + which_struct = __cpu_model_var;
> + break;
> + case IX86_BUILTIN_CPU_IS_INTEL_COREI7:
> + field = get_field_from_struct (__processor_model_type,
> + M_INTEL_COREI7);
> + which_struct = __cpu_model_var;
> + break;
> + case IX86_BUILTIN_CPU_IS_INTEL_COREI7_NEHALEM:
> + field = get_field_from_struct (__processor_model_type,
> + M_INTEL_COREI7_NEHALEM);
> + which_struct = __cpu_model_var;
> + break;
> + case IX86_BUILTIN_CPU_IS_INTEL_COREI7_WESTMERE:
> + field = get_field_from_struct (__processor_model_type,
> + M_INTEL_COREI7_WESTMERE);
> + which_struct = __cpu_model_var;
> + break;
> + case IX86_BUILTIN_CPU_IS_INTEL_COREI7_SANDYBRIDGE:
> + field = get_field_from_struct (__processor_model_type,
> + M_INTEL_COREI7_SANDYBRIDGE);
> + which_struct = __cpu_model_var;
> + break;
> + case IX86_BUILTIN_CPU_IS_AMDFAM10H:
> + field = get_field_from_struct (__processor_model_type,
> + M_AMDFAM10H);
> + which_struct = __cpu_model_var;
> + break;
> + case IX86_BUILTIN_CPU_IS_AMDFAM10H_BARCELONA:
> + field = get_field_from_struct (__processor_model_type,
> + M_AMDFAM10H_BARCELONA);
> + which_struct = __cpu_model_var;
> + break;
> + case IX86_BUILTIN_CPU_IS_AMDFAM10H_SHANGHAI:
> + field = get_field_from_struct (__processor_model_type,
> + M_AMDFAM10H_SHANGHAI);
> + which_struct = __cpu_model_var;
> + break;
> + case IX86_BUILTIN_CPU_IS_AMDFAM10H_ISTANBUL:
> + field = get_field_from_struct (__processor_model_type,
> + M_AMDFAM10H_ISTANBUL);
> + which_struct = __cpu_model_var;
> + break;
> + case IX86_BUILTIN_CPU_IS_AMDFAM15H_BDVER1:
> + field = get_field_from_struct (__processor_model_type,
> + M_AMDFAM15H_BDVER1);
> + which_struct = __cpu_model_var;
> + break;
> + case IX86_BUILTIN_CPU_IS_AMDFAM15H_BDVER2:
> + field = get_field_from_struct (__processor_model_type,
> + M_AMDFAM15H_BDVER2);
> + which_struct = __cpu_model_var;
> + break;
> + default:
> + return NULL_TREE;
> + }
> +
> + return build3 (COMPONENT_REF, TREE_TYPE (field), which_struct, field, NULL_TREE);
> +}
> +
> +static tree
> +ix86_fold_builtin (tree fndecl, int n_args ATTRIBUTE_UNUSED,
> + tree *args ATTRIBUTE_UNUSED, bool ignore ATTRIBUTE_UNUSED)
> +{
> + const char* decl_name = IDENTIFIER_POINTER (DECL_NAME (fndecl));
> + if (DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_MD
> + && strstr(decl_name, "__builtin_cpu") != NULL)
> + {
> + enum ix86_builtins code = (enum ix86_builtins)
> + DECL_FUNCTION_CODE (fndecl);
> + return fold_builtin_cpu (code);
> + }
> + return NULL_TREE;
> +}
> +
> +/* A builtin to init/return the cpu type or feature. Returns an
> + integer and the type is a const if IS_CONST is set. */
> +
> +static void
> +make_platform_builtin (const char* name, int code, int is_const)
> +{
> + tree decl;
> + tree type;
> +
> + type = ix86_get_builtin_func_type (INT_FTYPE_VOID);
> + decl = add_builtin_function (name, type, code, BUILT_IN_MD,
> + NULL, NULL_TREE);
> + gcc_assert (decl != NULL_TREE);
> + ix86_builtins[(int) code] = decl;
> + if (is_const)
> + TREE_READONLY (decl) = 1;
> +}
> +
> +/* Builtins to get CPU type and features supported. */
> +
> +static void
> +ix86_init_platform_type_builtins (void)
> +{
> + make_platform_builtin ("__builtin_cpu_init",
> + IX86_BUILTIN_CPU_INIT, 0);
> + make_platform_builtin ("__builtin_cpu_supports_cmov",
> + IX86_BUILTIN_CPU_SUPPORTS_CMOV, 1);
> + make_platform_builtin ("__builtin_cpu_supports_mmx",
> + IX86_BUILTIN_CPU_SUPPORTS_MMX, 1);
> + make_platform_builtin ("__builtin_cpu_supports_popcount",
> + IX86_BUILTIN_CPU_SUPPORTS_POPCOUNT, 1);
> + make_platform_builtin ("__builtin_cpu_supports_sse",
> + IX86_BUILTIN_CPU_SUPPORTS_SSE, 1);
> + make_platform_builtin ("__builtin_cpu_supports_sse2",
> + IX86_BUILTIN_CPU_SUPPORTS_SSE2, 1);
> + make_platform_builtin ("__builtin_cpu_supports_sse3",
> + IX86_BUILTIN_CPU_SUPPORTS_SSE3, 1);
> + make_platform_builtin ("__builtin_cpu_supports_ssse3",
> + IX86_BUILTIN_CPU_SUPPORTS_SSSE3, 1);
> + make_platform_builtin ("__builtin_cpu_supports_sse4_1",
> + IX86_BUILTIN_CPU_SUPPORTS_SSE4_1, 1);
> + make_platform_builtin ("__builtin_cpu_supports_sse4_2",
> + IX86_BUILTIN_CPU_SUPPORTS_SSE4_2, 1);
> + make_platform_builtin ("__builtin_cpu_is_amd",
> + IX86_BUILTIN_CPU_IS_AMD, 1);
> + make_platform_builtin ("__builtin_cpu_is_intel_atom",
> + IX86_BUILTIN_CPU_IS_INTEL_ATOM, 1);
> + make_platform_builtin ("__builtin_cpu_is_intel_core2",
> + IX86_BUILTIN_CPU_IS_INTEL_CORE2, 1);
> + make_platform_builtin ("__builtin_cpu_is_intel",
> + IX86_BUILTIN_CPU_IS_INTEL, 1);
> + make_platform_builtin ("__builtin_cpu_is_intel_corei7",
> + IX86_BUILTIN_CPU_IS_INTEL_COREI7, 1);
> + make_platform_builtin ("__builtin_cpu_is_intel_corei7_nehalem",
> + IX86_BUILTIN_CPU_IS_INTEL_COREI7_NEHALEM, 1);
> + make_platform_builtin ("__builtin_cpu_is_intel_corei7_westmere",
> + IX86_BUILTIN_CPU_IS_INTEL_COREI7_WESTMERE, 1);
> + make_platform_builtin ("__builtin_cpu_is_intel_corei7_sandybridge",
> + IX86_BUILTIN_CPU_IS_INTEL_COREI7_SANDYBRIDGE, 1);
> + make_platform_builtin ("__builtin_cpu_is_amdfam10",
> + IX86_BUILTIN_CPU_IS_AMDFAM10H, 1);
> + make_platform_builtin ("__builtin_cpu_is_amdfam10_barcelona",
> + IX86_BUILTIN_CPU_IS_AMDFAM10H_BARCELONA, 1);
> + make_platform_builtin ("__builtin_cpu_is_amdfam10_shanghai",
> + IX86_BUILTIN_CPU_IS_AMDFAM10H_SHANGHAI, 1);
> + make_platform_builtin ("__builtin_cpu_is_amdfam10_istanbul",
> + IX86_BUILTIN_CPU_IS_AMDFAM10H_ISTANBUL, 1);
> + make_platform_builtin ("__builtin_cpu_is_amdfam15_bdver1",
> + IX86_BUILTIN_CPU_IS_AMDFAM15H_BDVER1, 1);
> + make_platform_builtin ("__builtin_cpu_is_amdfam15_bdver2",
> + IX86_BUILTIN_CPU_IS_AMDFAM15H_BDVER2, 1);
> +}
> +
> /* Internal method for ix86_init_builtins. */
>
> static void
> @@ -27529,6 +28143,9 @@ ix86_init_builtins (void)
>
> ix86_init_builtin_types ();
>
> + /* Builtins to get CPU type and features. */
> + ix86_init_platform_type_builtins ();
> +
> /* TFmode support builtins. */
> def_builtin_const (0, "__builtin_infq",
> FLOAT128_FTYPE_VOID, IX86_BUILTIN_INFQ);
> @@ -29145,6 +29762,48 @@ ix86_expand_builtin (tree exp, rtx target, rtx sub
> enum machine_mode mode0, mode1, mode2, mode3, mode4;
> unsigned int fcode = DECL_FUNCTION_CODE (fndecl);
>
> + /* For CPU builtins that can be folded, fold first and expand the fold. */
> + switch (fcode)
> + {
> + case IX86_BUILTIN_CPU_SUPPORTS_CMOV:
> + case IX86_BUILTIN_CPU_SUPPORTS_MMX:
> + case IX86_BUILTIN_CPU_SUPPORTS_POPCOUNT:
> + case IX86_BUILTIN_CPU_SUPPORTS_SSE:
> + case IX86_BUILTIN_CPU_SUPPORTS_SSE2:
> + case IX86_BUILTIN_CPU_SUPPORTS_SSE3:
> + case IX86_BUILTIN_CPU_SUPPORTS_SSSE3:
> + case IX86_BUILTIN_CPU_SUPPORTS_SSE4_1:
> + case IX86_BUILTIN_CPU_SUPPORTS_SSE4_2:
> + case IX86_BUILTIN_CPU_IS_AMD:
> + case IX86_BUILTIN_CPU_IS_INTEL:
> + case IX86_BUILTIN_CPU_IS_INTEL_ATOM:
> + case IX86_BUILTIN_CPU_IS_INTEL_CORE2:
> + case IX86_BUILTIN_CPU_IS_INTEL_COREI7:
> + case IX86_BUILTIN_CPU_IS_INTEL_COREI7_NEHALEM:
> + case IX86_BUILTIN_CPU_IS_INTEL_COREI7_WESTMERE:
> + case IX86_BUILTIN_CPU_IS_INTEL_COREI7_SANDYBRIDGE:
> + case IX86_BUILTIN_CPU_IS_AMDFAM10H:
> + case IX86_BUILTIN_CPU_IS_AMDFAM10H_BARCELONA:
> + case IX86_BUILTIN_CPU_IS_AMDFAM10H_SHANGHAI:
> + case IX86_BUILTIN_CPU_IS_AMDFAM10H_ISTANBUL:
> + case IX86_BUILTIN_CPU_IS_AMDFAM15H_BDVER1:
> + case IX86_BUILTIN_CPU_IS_AMDFAM15H_BDVER2:
> + {
> + tree fold_expr = fold_builtin_cpu ((enum ix86_builtins) fcode);
> + gcc_assert (fold_expr != NULL_TREE);
> + return expand_expr (fold_expr, target, mode, EXPAND_NORMAL);
> + }
> + case IX86_BUILTIN_CPU_INIT:
> + {
> + /* Make it call __cpu_indicator_init in libgcc. */
> + tree call_expr, fndecl, type;
> + type = build_function_type_list (integer_type_node, NULL_TREE);
> + fndecl = build_fn_decl ("__cpu_indicator_init", type);
> + call_expr = build_call_expr (fndecl, 0);
> + return expand_expr (call_expr, target, mode, EXPAND_NORMAL);
> + }
> + }
> +
> /* Determine whether the builtin function is available under the current ISA.
> Originally the builtin was not created if it wasn't applicable to the
> current ISA based on the command line switches. With function specific
> @@ -38610,6 +39269,12 @@ ix86_autovectorize_vector_sizes (void)
> #undef TARGET_BUILD_BUILTIN_VA_LIST
> #define TARGET_BUILD_BUILTIN_VA_LIST ix86_build_builtin_va_list
>
> +#undef TARGET_FOLD_BUILTIN
> +#define TARGET_FOLD_BUILTIN ix86_fold_builtin
> +
> #undef TARGET_ENUM_VA_LIST_P
> #define TARGET_ENUM_VA_LIST_P ix86_enum_va_list
> Index: gcc/testsuite/gcc.target/i386/builtin_target.c
> ===================================================================
> --- gcc/testsuite/gcc.target/i386/builtin_target.c (revision 0)
> +++ gcc/testsuite/gcc.target/i386/builtin_target.c (revision 0)
> @@ -0,0 +1,61 @@
> +/* This test checks if the __builtin_cpu_* calls are recognized. */
> +
> +/* { dg-do run } */
> +
> +int
> +fn1 ()
> +{
> + if (__builtin_cpu_supports_cmov () < 0)
> + return -1;
> + if (__builtin_cpu_supports_mmx () < 0)
> + return -1;
> + if (__builtin_cpu_supports_popcount () < 0)
> + return -1;
> + if (__builtin_cpu_supports_sse () < 0)
> + return -1;
> + if (__builtin_cpu_supports_sse2 () < 0)
> + return -1;
> + if (__builtin_cpu_supports_sse3 () < 0)
> + return -1;
> + if (__builtin_cpu_supports_ssse3 () < 0)
> + return -1;
> + if (__builtin_cpu_supports_sse4_1 () < 0)
> + return -1;
> + if (__builtin_cpu_supports_sse4_2 () < 0)
> + return -1;
> + if (__builtin_cpu_is_amd () < 0)
> + return -1;
> + if (__builtin_cpu_is_intel () < 0)
> + return -1;
> + if (__builtin_cpu_is_intel_atom () < 0)
> + return -1;
> + if (__builtin_cpu_is_intel_core2 () < 0)
> + return -1;
> + if (__builtin_cpu_is_intel_corei7 () < 0)
> + return -1;
> + if (__builtin_cpu_is_intel_corei7_nehalem () < 0)
> + return -1;
> + if (__builtin_cpu_is_intel_corei7_westmere () < 0)
> + return -1;
> + if (__builtin_cpu_is_intel_corei7_sandybridge () < 0)
> + return -1;
> + if (__builtin_cpu_is_amdfam10 () < 0)
> + return -1;
> + if (__builtin_cpu_is_amdfam10_barcelona () < 0)
> + return -1;
> + if (__builtin_cpu_is_amdfam10_shanghai () < 0)
> + return -1;
> + if (__builtin_cpu_is_amdfam10_istanbul () < 0)
> + return -1;
> + if (__builtin_cpu_is_amdfam15_bdver1 () < 0)
> + return -1;
> + if (__builtin_cpu_is_amdfam15_bdver2 () < 0)
> + return -1;
> +
> + return 0;
> +}
> +
> +int main ()
> +{
> + return fn1 ();
> +}
>
> --
> This patch is available for review at http://codereview.appspot.com/5754058
Message from davidxl@google.com
2012-03-08T20:35:23+00:00 davidxl
On Wed, Mar 7, 2012 at 5:51 AM, Richard Guenther
<richard.guenther@gmail.com> wrote:
> On Wed, Mar 7, 2012 at 1:49 AM, Sriraman Tallam <tmsriram@google.com> wrote:
>> Patch for CPU detection at run-time.
>> ===================================
>>
>> Patch for CPU detection at run-time, to be used in dispatching of
>> multi-versioned functions. Please see this discussion:
>> http://gcc.gnu.org/ml/gcc-patches/2011-08/msg01355.html
>> when this patch for reviewed the last time.
>>
>> For more detailed description:
>> http://gcc.gnu.org/ml/gcc/2012-03/msg00074.html
>>
>> One of the main concerns was about making CPU detection initialization a
>> constructor. The main point raised was about constructor ordering. I have
>> added a priority value to the CPU detection constructor to make it very high
>> priority so that it is guaranteed to fire before every constructor without
>> an explicitly marked priority value of 101. However, IFUNC initializers
>> will still fire before this constructor, so the cpu initialization routine
>> has to be explicitly called in such initializers for which I have added a
>> builtin: __builtin_cpu_init ().
>>
>> This patch adds the following new builtins:
>>
>> * __builtin_cpu_init
>> * __builtin_cpu_supports_cmov
>> * __builtin_cpu_supports_mmx
>> * __builtin_cpu_supports_popcount
>> * __builtin_cpu_supports_sse
>> * __builtin_cpu_supports_sse2
>> * __builtin_cpu_supports_sse3
>> * __builtin_cpu_supports_ssse3
>> * __builtin_cpu_supports_sse4_1
>> * __builtin_cpu_supports_sse4_2
>> * __builtin_cpu_is_amd
>> * __builtin_cpu_is_intel_atom
>> * __builtin_cpu_is_intel_core2
>> * __builtin_cpu_is_intel
>> * __builtin_cpu_is_intel_corei7
>> * __builtin_cpu_is_intel_corei7_nehalem
>> * __builtin_cpu_is_intel_corei7_westmere
>> * __builtin_cpu_is_intel_corei7_sandybridge
>> * __builtin_cpu_is_amdfam10
>> * __builtin_cpu_is_amdfam10_barcelona
>> * __builtin_cpu_is_amdfam10_shanghai
>> * __builtin_cpu_is_amdfam10_istanbul
>> * __builtin_cpu_is_amdfam15_bdver1
>> * __builtin_cpu_is_amdfam15_bdver2
>
> I think the non-feature detection functions are not necessary at all.
They are useful if the compiler needs to do auto-versioning based on the CPU model.
> Builtin functions are not exactly cheap, nor is the scheme you invent
> backward/forward compatible. Instead, why not add a single builtin
> function, __builtin_cpu_supports(const char *), and decode from
> a comma-separated list of features? Unknown features are simply
> "not present". So I can write code with only a single configure check,
This is a good idea.
__builtin_is_cpu (const char *);
__builtin_cpu_supports (const char *);
thanks,
David
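
To make the string-based proposal concrete, here is a minimal sketch of how a caller might dispatch on it. It assumes the spelling that GCC 4.8 and later document (`__builtin_cpu_init ()`, `__builtin_cpu_supports ("sse4.2")`, `__builtin_cpu_is ("intel")`), with one feature name per call rather than a comma-separated list; none of that is settled in this thread. The names `HAVE_CPU_BUILTINS`, `sum_generic`, `sum_sse42_stand_in`, `pick_sum` and `cpu_vendor_name` are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: the string-based builtins exist only on x86 with
   GCC >= 4.8.  Everything else takes the generic path.  */
#if (defined (__x86_64__) || defined (__i386__)) \
    && defined (__GNUC__) \
    && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 8))
#define HAVE_CPU_BUILTINS 1
#else
#define HAVE_CPU_BUILTINS 0
#endif

typedef int (*sum_fn) (const int *, size_t);

/* Portable version; always correct.  */
int
sum_generic (const int *v, size_t n)
{
  int total = 0;
  size_t i;

  assert (v != NULL || n == 0);
  for (i = 0; i < n; i++)
    total += v[i];
  return total;
}

/* Stand-in for a tuned version.  A real one would be compiled with
   SSE4.2 enabled; here it just forwards to the generic code.  */
int
sum_sse42_stand_in (const int *v, size_t n)
{
  return sum_generic (v, n);
}

/* Pick an implementation once, based on a single feature query.  */
sum_fn
pick_sum (void)
{
#if HAVE_CPU_BUILTINS
  /* Harmless when the constructor has already run; needed when this
     is reached from an IFUNC resolver.  */
  __builtin_cpu_init ();
  if (__builtin_cpu_supports ("sse4.2"))
    return sum_sse42_stand_in;
#endif
  return sum_generic;
}

/* Model/vendor query through the second proposed builtin.  */
const char *
cpu_vendor_name (void)
{
#if HAVE_CPU_BUILTINS
  __builtin_cpu_init ();
  if (__builtin_cpu_is ("intel"))
    return "intel";
  if (__builtin_cpu_is ("amd"))
    return "amd";
#endif
  return "other";
}
```

With this shape a project needs a single configure check for the builtin, and a feature name the compiler does not know can simply take the generic path, which is the compatibility argument made above.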
> for __builtin_cpu_supports, and cater for future features or older compilers.
>
> And of course that builtin would be even cross-platform.
>
> Implementation-wise I'll leave this to x86 maintainers to comment on.
>
> Richard.
>
>>
>> * config/i386/i386.c (build_struct_with_one_bit_fields): New function.
>> (make_var_decl): New function.
>> (get_field_from_struct): New function.
>> (fold_builtin_target): New function.
>> (ix86_fold_builtin): New function.
>> (ix86_expand_builtin): Expand new builtins by folding them.
>> (make_platform_builtin): New functions.
>> (ix86_init_platform_type_builtins): Make the new builtins.
>> (ix86_init_builtins): Make new builtins to detect CPU type.
>> (TARGET_FOLD_BUILTIN): New macro.
>> (IX86_BUILTIN_CPU_SUPPORTS_CMOV): New enum value.
>> (IX86_BUILTIN_CPU_SUPPORTS_MMX): New enum value.
>> (IX86_BUILTIN_CPU_SUPPORTS_POPCOUNT): New enum value.
>> (IX86_BUILTIN_CPU_SUPPORTS_SSE): New enum value.
>> (IX86_BUILTIN_CPU_SUPPORTS_SSE2): New enum value.
>> (IX86_BUILTIN_CPU_SUPPORTS_SSE3): New enum value.
>> (IX86_BUILTIN_CPU_SUPPORTS_SSSE3): New enum value.
>> (IX86_BUILTIN_CPU_SUPPORTS_SSE4_1): New enum value.
>> (IX86_BUILTIN_CPU_SUPPORTS_SSE4_2): New enum value.
>> (IX86_BUILTIN_CPU_INIT): New enum value.
>> (IX86_BUILTIN_CPU_IS_AMD): New enum value.
>> (IX86_BUILTIN_CPU_IS_INTEL): New enum value.
>> (IX86_BUILTIN_CPU_IS_INTEL_ATOM): New enum value.
>> (IX86_BUILTIN_CPU_IS_INTEL_CORE2): New enum value.
>> (IX86_BUILTIN_CPU_IS_INTEL_COREI7_NEHALEM): New enum value.
>> (IX86_BUILTIN_CPU_IS_INTEL_COREI7_WESTMERE): New enum value.
>> (IX86_BUILTIN_CPU_IS_INTEL_COREI7_SANDYBRIDGE): New enum value.
>> (IX86_BUILTIN_CPU_IS_AMDFAM10_BARCELONA): New enum value.
>> (IX86_BUILTIN_CPU_IS_AMDFAM10_SHANGHAI): New enum value.
>> (IX86_BUILTIN_CPU_IS_AMDFAM10_ISTANBUL): New enum value.
>> (IX86_BUILTIN_CPU_IS_AMDFAM15H_BDVER1): New enum value.
>> (IX86_BUILTIN_CPU_IS_AMDFAM15H_BDVER2): New enum value.
>> * config/i386/i386-builtin-types.def: New function type.
>> * testsuite/gcc.target/builtin_target.c: New testcase.
>>
>> * libgcc/config/i386/i386-cpuinfo.c: New file.
>> * libgcc/config/i386/t-cpuinfo: New file.
>> * libgcc/config.host: Include t-cpuinfo.
>> * libgcc/config/i386/libgcc-glibc.ver: Version symbols __cpu_model
>> and __cpu_features.
>>
>> Index: libgcc/config.host
>> ===================================================================
>> --- libgcc/config.host (revision 184971)
>> +++ libgcc/config.host (working copy)
>> @@ -1142,7 +1142,7 @@ i[34567]86-*-linux* | x86_64-*-linux* | \
>> i[34567]86-*-kfreebsd*-gnu | x86_64-*-kfreebsd*-gnu | \
>> i[34567]86-*-knetbsd*-gnu | \
>> i[34567]86-*-gnu*)
>> - tmake_file="${tmake_file} t-tls i386/t-linux"
>> + tmake_file="${tmake_file} t-tls i386/t-linux i386/t-cpuinfo"
>> if test "$libgcc_cv_cfi" = "yes"; then
>> tmake_file="${tmake_file} t-stack i386/t-stack-i386"
>> fi
>> Index: libgcc/config/i386/t-cpuinfo
>> ===================================================================
>> --- libgcc/config/i386/t-cpuinfo (revision 0)
>> +++ libgcc/config/i386/t-cpuinfo (revision 0)
>> @@ -0,0 +1 @@
>> +LIB2ADD += $(srcdir)/config/i386/i386-cpuinfo.c
>> Index: libgcc/config/i386/i386-cpuinfo.c
>> ===================================================================
>> --- libgcc/config/i386/i386-cpuinfo.c (revision 0)
>> +++ libgcc/config/i386/i386-cpuinfo.c (revision 0)
>> @@ -0,0 +1,306 @@
>> +/* Get CPU type and Features for x86 processors.
>> + Copyright (C) 2011 Free Software Foundation, Inc.
>> + Contributed by Sriraman Tallam (tmsriram@google.com)
>> +
>> +This file is part of GCC.
>> +
>> +GCC is free software; you can redistribute it and/or modify it under
>> +the terms of the GNU General Public License as published by the Free
>> +Software Foundation; either version 3, or (at your option) any later
>> +version.
>> +
>> +GCC is distributed in the hope that it will be useful, but WITHOUT ANY
>> +WARRANTY; without even the implied warranty of MERCHANTABILITY or
>> +FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
>> +for more details.
>> +
>> +You should have received a copy of the GNU General Public License
>> +along with GCC; see the file COPYING3. If not see
>> +<http://www.gnu.org/licenses/>. */
>> +
>> +#include "cpuid.h"
>> +#include "tsystem.h"
>> +
>> +int __cpu_indicator_init (void) __attribute__ ((constructor (101)));
>> +
>> +enum vendor_signatures
>> +{
>> + SIG_INTEL = 0x756e6547 /* Genu */,
>> + SIG_AMD = 0x68747541 /* Auth */
>> +};
>> +
>> +/* ISA Features supported. */
>> +
>> +struct __processor_features
>> +{
>> + unsigned int __cpu_cmov : 1;
>> + unsigned int __cpu_mmx : 1;
>> + unsigned int __cpu_popcnt : 1;
>> + unsigned int __cpu_sse : 1;
>> + unsigned int __cpu_sse2 : 1;
>> + unsigned int __cpu_sse3 : 1;
>> + unsigned int __cpu_ssse3 : 1;
>> + unsigned int __cpu_sse4_1 : 1;
>> + unsigned int __cpu_sse4_2 : 1;
>> +} __cpu_features;
>> +
>> +/* Processor Model. */
>> +
>> +struct __processor_model
>> +{
>> + /* Vendor. */
>> + unsigned int __cpu_is_amd : 1;
>> + unsigned int __cpu_is_intel : 1;
>> + /* CPU type. */
>> + unsigned int __cpu_is_intel_atom : 1;
>> + unsigned int __cpu_is_intel_core2 : 1;
>> + unsigned int __cpu_is_intel_corei7 : 1;
>> + unsigned int __cpu_is_intel_corei7_nehalem : 1;
>> + unsigned int __cpu_is_intel_corei7_westmere : 1;
>> + unsigned int __cpu_is_intel_corei7_sandybridge : 1;
>> + unsigned int __cpu_is_amdfam10h : 1;
>> + unsigned int __cpu_is_amdfam10h_barcelona : 1;
>> + unsigned int __cpu_is_amdfam10h_shanghai : 1;
>> + unsigned int __cpu_is_amdfam10h_istanbul : 1;
>> + unsigned int __cpu_is_amdfam15h_bdver1 : 1;
>> + unsigned int __cpu_is_amdfam15h_bdver2 : 1;
>> +} __cpu_model;
>> +
>> +/* Get the specific type of AMD CPU. */
>> +
>> +static void
>> +get_amd_cpu (unsigned int family, unsigned int model)
>> +{
>> + switch (family)
>> + {
>> + /* AMD Family 10h. */
>> + case 0x10:
>> + switch (model)
>> + {
>> + case 0x2:
>> + /* Barcelona. */
>> + __cpu_model.__cpu_is_amdfam10h = 1;
>> + __cpu_model.__cpu_is_amdfam10h_barcelona = 1;
>> + break;
>> + case 0x4:
>> + /* Shanghai. */
>> + __cpu_model.__cpu_is_amdfam10h = 1;
>> + __cpu_model.__cpu_is_amdfam10h_shanghai = 1;
>> + break;
>> + case 0x8:
>> + /* Istanbul. */
>> + __cpu_model.__cpu_is_amdfam10h = 1;
>> + __cpu_model.__cpu_is_amdfam10h_istanbul = 1;
>> + break;
>> + default:
>> + break;
>> + }
>> + break;
>> + /* AMD Family 15h. */
>> + case 0x15:
>> + /* Bulldozer version 1. */
>> + if (model >= 0 && model <= 0xf)
>> + __cpu_model.__cpu_is_amdfam15h_bdver1 = 1;
>> + /* Bulldozer version 2. */
>> + if (model >= 0x10 && model <= 0x1f)
>> + __cpu_model.__cpu_is_amdfam15h_bdver2 = 1;
>> + break;
>> + default:
>> + break;
>> + }
>> +}
>> +
>> +/* Get the specific type of Intel CPU. */
>> +
>> +static void
>> +get_intel_cpu (unsigned int family, unsigned int model, unsigned int brand_id)
>> +{
>> + /* Parse family and model only if brand ID is 0. */
>> + if (brand_id == 0)
>> + {
>> + switch (family)
>> + {
>> + case 0x5:
>> + /* Pentium. */
>> + break;
>> + case 0x6:
>> + switch (model)
>> + {
>> + case 0x1c:
>> + case 0x26:
>> + /* Atom. */
>> + __cpu_model.__cpu_is_intel_atom = 1;
>> + break;
>> + case 0x1a:
>> + case 0x1e:
>> + case 0x1f:
>> + case 0x2e:
>> + /* Nehalem. */
>> + __cpu_model.__cpu_is_intel_corei7 = 1;
>> + __cpu_model.__cpu_is_intel_corei7_nehalem = 1;
>> + break;
>> + case 0x25:
>> + case 0x2c:
>> + case 0x2f:
>> + /* Westmere. */
>> + __cpu_model.__cpu_is_intel_corei7 = 1;
>> + __cpu_model.__cpu_is_intel_corei7_westmere = 1;
>> + break;
>> + case 0x2a:
>> + /* Sandy Bridge. */
>> + __cpu_model.__cpu_is_intel_corei7 = 1;
>> + __cpu_model.__cpu_is_intel_corei7_sandybridge = 1;
>> + break;
>> + case 0x17:
>> + case 0x1d:
>> + /* Penryn. */
>> + case 0x0f:
>> + /* Merom. */
>> + __cpu_model.__cpu_is_intel_core2 = 1;
>> + break;
>> + default:
>> + break;
>> + }
>> + break;
>> + default:
>> + /* We have no idea. */
>> + break;
>> + }
>> + }
>> +}
>> +
>> +static void
>> +get_available_features (unsigned int ecx, unsigned int edx)
>> +{
>> + __cpu_features.__cpu_cmov = (edx & bit_CMOV) ? 1 : 0;
>> + __cpu_features.__cpu_mmx = (edx & bit_MMX) ? 1 : 0;
>> + __cpu_features.__cpu_sse = (edx & bit_SSE) ? 1 : 0;
>> + __cpu_features.__cpu_sse2 = (edx & bit_SSE2) ? 1 : 0;
>> + __cpu_features.__cpu_popcnt = (ecx & bit_POPCNT) ? 1 : 0;
>> + __cpu_features.__cpu_sse3 = (ecx & bit_SSE3) ? 1 : 0;
>> + __cpu_features.__cpu_ssse3 = (ecx & bit_SSSE3) ? 1 : 0;
>> + __cpu_features.__cpu_sse4_1 = (ecx & bit_SSE4_1) ? 1 : 0;
>> + __cpu_features.__cpu_sse4_2 = (ecx & bit_SSE4_2) ? 1 : 0;
>> +}
>> +
>> +
>> +/* Sanity check for the vendor and cpu type flags. */
>> +
>> +static int
>> +sanity_check (void)
>> +{
>> + unsigned int one_type = 0;
>> +
>> + /* Vendor cannot be Intel and AMD. */
>> + gcc_assert((__cpu_model.__cpu_is_intel == 0)
>> + || (__cpu_model.__cpu_is_amd == 0));
>> +
>> + /* Only one CPU type can be set. */
>> + one_type = (__cpu_model.__cpu_is_intel_atom
>> + + __cpu_model.__cpu_is_intel_core2
>> + + __cpu_model.__cpu_is_intel_corei7_nehalem
>> + + __cpu_model.__cpu_is_intel_corei7_westmere
>> + + __cpu_model.__cpu_is_intel_corei7_sandybridge
>> + + __cpu_model.__cpu_is_amdfam10h_barcelona
>> + + __cpu_model.__cpu_is_amdfam10h_shanghai
>> + + __cpu_model.__cpu_is_amdfam10h_istanbul
>> + + __cpu_model.__cpu_is_amdfam15h_bdver1
>> + + __cpu_model.__cpu_is_amdfam15h_bdver2);
>> +
>> + gcc_assert (one_type <= 1);
>> + return 0;
>> +}
>> +
>> +/* A noinline function calling __get_cpuid. Having many calls to
>> + cpuid in one function in 32-bit mode causes GCC to complain:
>> + "can’t find a register in class ‘CLOBBERED_REGS’". This is
>> + related to PR rtl-optimization 44174. */
>> +
>> +static int __attribute__ ((noinline))
>> +__get_cpuid_output (unsigned int __level,
>> + unsigned int *__eax, unsigned int *__ebx,
>> + unsigned int *__ecx, unsigned int *__edx)
>> +{
>> + return __get_cpuid (__level, __eax, __ebx, __ecx, __edx);
>> +}
>> +
>> +
>> +/* A constructor function that is sets __cpu_model and __cpu_features with
>> + the right values. This needs to run only once. This constructor is
>> + given the highest priority and it should run before constructors without
>> + the priority set. However, it still runs after ifunc initializers and
>> + needs to be called explicitly there. */
>> +
>> +int __attribute__ ((constructor (101)))
>> +__cpu_indicator_init (void)
>> +{
>> + unsigned int eax, ebx, ecx, edx;
>> +
>> + int max_level = 5;
>> + unsigned int vendor;
>> + unsigned int model, family, brand_id;
>> + unsigned int extended_model, extended_family;
>> + static int called = 0;
>> +
>> + /* This function needs to run just once. */
>> + if (called)
>> + return 0;
>> + else
>> + called = 1;
>> +
>> + /* Assume cpuid insn present. Run in level 0 to get vendor id. */
>> + if (!__get_cpuid_output (0, &eax, &ebx, &ecx, &edx))
>> + return -1;
>> +
>> + vendor = ebx;
>> + max_level = eax;
>> +
>> + if (max_level < 1)
>> + return -1;
>> +
>> + if (!__get_cpuid_output (1, &eax, &ebx, &ecx, &edx))
>> + return -1;
>> +
>> + model = (eax >> 4) & 0x0f;
>> + family = (eax >> 8) & 0x0f;
>> + brand_id = ebx & 0xff;
>> + extended_model = (eax >> 12) & 0xf0;
>> + extended_family = (eax >> 20) & 0xff;
>> +
>> + if (vendor == SIG_INTEL)
>> + {
>> + /* Adjust model and family for Intel CPUS. */
>> + if (family == 0x0f)
>> + {
>> + family += extended_family;
>> + model += extended_model;
>> + }
>> + else if (family == 0x06)
>> + model += extended_model;
>> +
>> + /* Get CPU type. */
>> + __cpu_model.__cpu_is_intel = 1;
>> + get_intel_cpu (family, model, brand_id);
>> + }
>> +
>> + if (vendor == SIG_AMD)
>> + {
>> + /* Adjust model and family for AMD CPUS. */
>> + if (family == 0x0f)
>> + {
>> + family += extended_family;
>> + model += (extended_model << 4);
>> + }
>> +
>> + /* Get CPU type. */
>> + __cpu_model.__cpu_is_amd = 1;
>> + get_amd_cpu (family, model);
>> + }
>> +
>> + /* Find available features. */
>> + get_available_features (ecx, edx);
>> +
>> + sanity_check ();
>> +
>> + return 0;
>> +}
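
As a cross-check for the family/model arithmetic in __cpu_indicator_init above, here is a small standalone sketch that decodes a CPUID leaf 1 EAX value using the commonly documented layout (model in bits 7:4, family in bits 11:8, extended model in bits 19:16, extended family in bits 27:20). The names `cpu_sig` and `decode_cpuid_eax` are hypothetical, and the sample values in the self-check are illustrative signatures only. One thing it suggests checking: the patch computes extended_model as `(eax >> 12) & 0xf0`, which already places it in bits 7:4, so the additional `<< 4` in the AMD branch would appear to shift it a second time.

```c
#include <assert.h>

/* Displayed family and model, after applying the extended fields.  */
struct cpu_sig
{
  unsigned int family;
  unsigned int model;
};

/* Decode CPUID leaf 1 EAX.  IS_INTEL selects the extra rule that
   family 6 parts also use the extended model field.  */
struct cpu_sig
decode_cpuid_eax (unsigned int eax, int is_intel)
{
  unsigned int base_model = (eax >> 4) & 0x0f;
  unsigned int base_family = (eax >> 8) & 0x0f;
  unsigned int ext_model = (eax >> 16) & 0x0f;
  unsigned int ext_family = (eax >> 20) & 0xff;
  struct cpu_sig sig;

  sig.family = base_family;
  sig.model = base_model;

  if (base_family == 0x0f)
    {
      /* Extended family is added, extended model becomes the high
         nibble of the model number.  */
      sig.family += ext_family;
      sig.model += ext_model << 4;
    }
  else if (is_intel && base_family == 0x06)
    sig.model += ext_model << 4;

  return sig;
}

/* Self-check with an illustrative signature: family 6, model 0x2a,
   which the patch classifies as Sandy Bridge.  */
void
decode_cpuid_selfcheck (void)
{
  struct cpu_sig sig = decode_cpuid_eax (0x000206A7u, 1);

  assert (sig.family == 0x06);
  assert (sig.model == 0x2a);
}
```

Because the decoder takes the raw EAX value as an argument, it can be exercised on any host without executing cpuid.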
>> Index: libgcc/config/i386/libgcc-glibc.ver
>> ===================================================================
>> --- libgcc/config/i386/libgcc-glibc.ver (revision 184971)
>> +++ libgcc/config/i386/libgcc-glibc.ver (working copy)
>> @@ -147,6 +147,11 @@ GCC_4.3.0 {
>> __trunctfxf2
>> __unordtf2
>> }
>> +
>> +GCC_4.8.0 {
>> + __cpu_model
>> + __cpu_features
>> +}
>> %else
>> GCC_4.4.0 {
>> __addtf3
>> @@ -183,4 +188,9 @@ GCC_4.4.0 {
>> GCC_4.5.0 {
>> __extendxftf2
>> }
>> +
>> +GCC_4.8.0 {
>> + __cpu_model
>> + __cpu_features
>> +}
>> %endif
>> Index: gcc/config/i386/i386-builtin-types.def
>> ===================================================================
>> --- gcc/config/i386/i386-builtin-types.def (revision 184971)
>> +++ gcc/config/i386/i386-builtin-types.def (working copy)
>> @@ -143,6 +143,7 @@ DEF_FUNCTION_TYPE (UINT64)
>> DEF_FUNCTION_TYPE (UNSIGNED)
>> DEF_FUNCTION_TYPE (VOID)
>> DEF_FUNCTION_TYPE (PVOID)
>> +DEF_FUNCTION_TYPE (INT)
>>
>> DEF_FUNCTION_TYPE (FLOAT, FLOAT)
>> DEF_FUNCTION_TYPE (FLOAT128, FLOAT128)
>> Index: gcc/config/i386/i386.c
>> ===================================================================
>> --- gcc/config/i386/i386.c (revision 184971)
>> +++ gcc/config/i386/i386.c (working copy)
>> @@ -25637,6 +25637,33 @@ enum ix86_builtins
>> /* CFString built-in for darwin */
>> IX86_BUILTIN_CFSTRING,
>>
>> + /* Builtins to get CPU features. */
>> + IX86_BUILTIN_CPU_SUPPORTS_CMOV,
>> + IX86_BUILTIN_CPU_SUPPORTS_MMX,
>> + IX86_BUILTIN_CPU_SUPPORTS_POPCOUNT,
>> + IX86_BUILTIN_CPU_SUPPORTS_SSE,
>> + IX86_BUILTIN_CPU_SUPPORTS_SSE2,
>> + IX86_BUILTIN_CPU_SUPPORTS_SSE3,
>> + IX86_BUILTIN_CPU_SUPPORTS_SSSE3,
>> + IX86_BUILTIN_CPU_SUPPORTS_SSE4_1,
>> + IX86_BUILTIN_CPU_SUPPORTS_SSE4_2,
>> + /* Builtins to get CPU type. */
>> + IX86_BUILTIN_CPU_INIT,
>> + IX86_BUILTIN_CPU_IS_AMD,
>> + IX86_BUILTIN_CPU_IS_INTEL,
>> + IX86_BUILTIN_CPU_IS_INTEL_ATOM,
>> + IX86_BUILTIN_CPU_IS_INTEL_CORE2,
>> + IX86_BUILTIN_CPU_IS_INTEL_COREI7,
>> + IX86_BUILTIN_CPU_IS_INTEL_COREI7_NEHALEM,
>> + IX86_BUILTIN_CPU_IS_INTEL_COREI7_WESTMERE,
>> + IX86_BUILTIN_CPU_IS_INTEL_COREI7_SANDYBRIDGE,
>> + IX86_BUILTIN_CPU_IS_AMDFAM10H,
>> + IX86_BUILTIN_CPU_IS_AMDFAM10H_BARCELONA,
>> + IX86_BUILTIN_CPU_IS_AMDFAM10H_SHANGHAI,
>> + IX86_BUILTIN_CPU_IS_AMDFAM10H_ISTANBUL,
>> + IX86_BUILTIN_CPU_IS_AMDFAM15H_BDVER1,
>> + IX86_BUILTIN_CPU_IS_AMDFAM15H_BDVER2,
>> +
>> IX86_BUILTIN_MAX
>> };
>>
>> @@ -27446,6 +27473,593 @@ ix86_init_mmx_sse_builtins (void)
>> }
>> }
>>
>> +/* Returns a struct type with name NAME and number of fields equal to
>> + NUM_FIELDS. Each field is a unsigned int bit field of length 1 bit. */
>> +
>> +static tree
>> +build_struct_with_one_bit_fields (int num_fields, const char *name)
>> +{
>> + int i;
>> + char field_name [10];
>> + tree field = NULL_TREE, field_chain = NULL_TREE;
>> + tree type = make_node (RECORD_TYPE);
>> +
>> + strcpy (field_name, "k_field");
>> +
>> + for (i = 0; i < num_fields; i++)
>> + {
>> + /* Name the fields, 0_field, 1_field, ... */
>> + field_name [0] = '0' + i;
>> + field = build_decl (UNKNOWN_LOCATION, FIELD_DECL,
>> + get_identifier (field_name), unsigned_type_node);
>> + DECL_BIT_FIELD (field) = 1;
>> + DECL_SIZE (field) = bitsize_one_node;
>> + if (field_chain != NULL_TREE)
>> + DECL_CHAIN (field) = field_chain;
>> + field_chain = field;
>> + }
>> + finish_builtin_struct (type, name, field_chain, NULL_TREE);
>> + return type;
>> +}
>> +
>> +/* Returns a extern, comdat VAR_DECL of type TYPE and name NAME. */
>> +
>> +static tree
>> +make_var_decl (tree type, const char *name)
>> +{
>> + tree new_decl;
>> + struct varpool_node *vnode;
>> +
>> + new_decl = build_decl (UNKNOWN_LOCATION,
>> + VAR_DECL,
>> + get_identifier(name),
>> + type);
>> +
>> + DECL_EXTERNAL (new_decl) = 1;
>> + TREE_STATIC (new_decl) = 1;
>> + TREE_PUBLIC (new_decl) = 1;
>> + DECL_INITIAL (new_decl) = 0;
>> + DECL_ARTIFICIAL (new_decl) = 0;
>> + DECL_PRESERVE_P (new_decl) = 1;
>> +
>> + make_decl_one_only (new_decl, DECL_ASSEMBLER_NAME (new_decl));
>> + assemble_variable (new_decl, 0, 0, 0);
>> +
>> + vnode = varpool_node (new_decl);
>> + gcc_assert (vnode != NULL);
>> + /* Set finalized to 1, otherwise it asserts in function "write_symbol" in
>> + lto-streamer-out.c. */
>> + vnode->finalized = 1;
>> +
>> + return new_decl;
>> +}
>> +
>> +/* Traverses the chain of fields in STRUCT_TYPE and returns the FIELD_NUM
>> + numbered field. */
>> +
>> +static tree
>> +get_field_from_struct (tree struct_type, int field_num)
>> +{
>> + int i;
>> + tree field = TYPE_FIELDS (struct_type);
>> +
>> + for (i = 0; i < field_num; i++, field = DECL_CHAIN(field))
>> + {
>> + gcc_assert (field != NULL_TREE);
>> + }
>> +
>> + return field;
>> +}
>> +
>> +/* FNDECL is a __builtin_cpu_* call that is folded into an integer defined
>> + in libgcc/config/i386/i386-cpuinfo.c */
>> +
>> +static tree
>> +fold_builtin_cpu (enum ix86_builtins fn_code)
>> +{
>> + /* This is the order of bit-fields in __processor_features in
>> + i386-cpuinfo.c */
>> + enum processor_features
>> + {
>> + F_CMOV = 0,
>> + F_MMX,
>> + F_POPCNT,
>> + F_SSE,
>> + F_SSE2,
>> + F_SSE3,
>> + F_SSSE3,
>> + F_SSE4_1,
>> + F_SSE4_2,
>> + F_MAX
>> + };
>> +
>> + /* This is the order of bit-fields in __processor_model in
>> + i386-cpuinfo.c */
>> + enum processor_model
>> + {
>> + M_AMD = 0,
>> + M_INTEL,
>> + M_INTEL_ATOM,
>> + M_INTEL_CORE2,
>> + M_INTEL_COREI7,
>> + M_INTEL_COREI7_NEHALEM,
>> + M_INTEL_COREI7_WESTMERE,
>> + M_INTEL_COREI7_SANDYBRIDGE,
>> + M_AMDFAM10H,
>> + M_AMDFAM10H_BARCELONA,
>> + M_AMDFAM10H_SHANGHAI,
>> + M_AMDFAM10H_ISTANBUL,
>> + M_AMDFAM15H_BDVER1,
>> + M_AMDFAM15H_BDVER2,
>> + M_MAX
>> + };
>> +
>> + static tree __processor_features_type = NULL_TREE;
>> + static tree __cpu_features_var = NULL_TREE;
>> + static tree __processor_model_type = NULL_TREE;
>> + static tree __cpu_model_var = NULL_TREE;
>> + static tree field;
>> + static tree which_struct;
>> +
>> + if (__processor_features_type == NULL_TREE)
>> + __processor_features_type = build_struct_with_one_bit_fields (F_MAX,
>> + "__processor_features");
>> +
>> + if (__processor_model_type == NULL_TREE)
>> + __processor_model_type = build_struct_with_one_bit_fields (M_MAX,
>> + "__processor_model");
>> +
>> + if (__cpu_features_var == NULL_TREE)
>> + __cpu_features_var = make_var_decl (__processor_features_type,
>> + "__cpu_features");
>> +
>> + if (__cpu_model_var == NULL_TREE)
>> + __cpu_model_var = make_var_decl (__processor_model_type,
>> + "__cpu_model");
>> +
>> + /* Look at the code to identify the field requested. */
>> + switch (fn_code)
>> + {
>> + case IX86_BUILTIN_CPU_SUPPORTS_CMOV:
>> + field = get_field_from_struct (__processor_features_type, F_CMOV);
>> + which_struct = __cpu_features_var;
>> + break;
>> + case IX86_BUILTIN_CPU_SUPPORTS_MMX:
>> + field = get_field_from_struct (__processor_features_type, F_MMX);
>> + which_struct = __cpu_features_var;
>> + break;
>> + case IX86_BUILTIN_CPU_SUPPORTS_POPCOUNT:
>> + field = get_field_from_struct (__processor_features_type, F_POPCNT);
>> + which_struct = __cpu_features_var;
>> + break;
>> + case IX86_BUILTIN_CPU_SUPPORTS_SSE:
>> + field = get_field_from_struct (__processor_features_type, F_SSE);
>> + which_struct = __cpu_features_var;
>> + break;
>> + case IX86_BUILTIN_CPU_SUPPORTS_SSE2:
>> + field = get_field_from_struct (__processor_features_type, F_SSE2);
>> + which_struct = __cpu_features_var;
>> + break;
>> + case IX86_BUILTIN_CPU_SUPPORTS_SSE3:
>> + field = get_field_from_struct (__processor_features_type, F_SSE3);
>> + which_struct = __cpu_features_var;
>> + break;
>> + case IX86_BUILTIN_CPU_SUPPORTS_SSSE3:
>> + field = get_field_from_struct (__processor_features_type, F_SSSE3);
>> + which_struct = __cpu_features_var;
>> + break;
>> + case IX86_BUILTIN_CPU_SUPPORTS_SSE4_1:
>> + field = get_field_from_struct (__processor_features_type, F_SSE4_1);
>> + which_struct = __cpu_features_var;
>> + break;
>> + case IX86_BUILTIN_CPU_SUPPORTS_SSE4_2:
>> + field = get_field_from_struct (__processor_features_type, F_SSE4_2);
>> + which_struct = __cpu_features_var;
>> + break;
>> + case IX86_BUILTIN_CPU_IS_AMD:
>> + field = get_field_from_struct (__processor_model_type, M_AMD);
>> + which_struct = __cpu_model_var;
>> + break;
>> + case IX86_BUILTIN_CPU_IS_INTEL:
>> + field = get_field_from_struct (__processor_model_type, M_INTEL);
>> + which_struct = __cpu_model_var;
>> + break;
>> + case IX86_BUILTIN_CPU_IS_INTEL_ATOM:
>> + field = get_field_from_struct (__processor_model_type, M_INTEL_ATOM);
>> + which_struct = __cpu_model_var;
>> + break;
>> + case IX86_BUILTIN_CPU_IS_INTEL_CORE2:
>> + field = get_field_from_struct (__processor_model_type, M_INTEL_CORE2);
>> + which_struct = __cpu_model_var;
>> + break;
>> + case IX86_BUILTIN_CPU_IS_INTEL_COREI7:
>> + field = get_field_from_struct (__processor_model_type,
>> + M_INTEL_COREI7);
>> + which_struct = __cpu_model_var;
>> + break;
>> + case IX86_BUILTIN_CPU_IS_INTEL_COREI7_NEHALEM:
>> + field = get_field_from_struct (__processor_model_type,
>> + M_INTEL_COREI7_NEHALEM);
>> + which_struct = __cpu_model_var;
>> + break;
>> + case IX86_BUILTIN_CPU_IS_INTEL_COREI7_WESTMERE:
>> + field = get_field_from_struct (__processor_model_type,
>> + M_INTEL_COREI7_WESTMERE);
>> + which_struct = __cpu_model_var;
>> + break;
>> + case IX86_BUILTIN_CPU_IS_INTEL_COREI7_SANDYBRIDGE:
>> + field = get_field_from_struct (__processor_model_type,
>> + M_INTEL_COREI7_SANDYBRIDGE);
>> + which_struct = __cpu_model_var;
>> + break;
>> + case IX86_BUILTIN_CPU_IS_AMDFAM10H:
>> + field = get_field_from_struct (__processor_model_type,
>> + M_AMDFAM10H);
>> + which_struct = __cpu_model_var;
>> + break;
>> + case IX86_BUILTIN_CPU_IS_AMDFAM10H_BARCELONA:
>> + field = get_field_from_struct (__processor_model_type,
>> + M_AMDFAM10H_BARCELONA);
>> + which_struct = __cpu_model_var;
>> + break;
>> + case IX86_BUILTIN_CPU_IS_AMDFAM10H_SHANGHAI:
>> + field = get_field_from_struct (__processor_model_type,
>> + M_AMDFAM10H_SHANGHAI);
>> + which_struct = __cpu_model_var;
>> + break;
>> + case IX86_BUILTIN_CPU_IS_AMDFAM10H_ISTANBUL:
>> + field = get_field_from_struct (__processor_model_type,
>> + M_AMDFAM10H_ISTANBUL);
>> + which_struct = __cpu_model_var;
>> + break;
>> + case IX86_BUILTIN_CPU_IS_AMDFAM15H_BDVER1:
>> + field = get_field_from_struct (__processor_model_type,
>> + M_AMDFAM15H_BDVER1);
>> + which_struct = __cpu_model_var;
>> + break;
>> + case IX86_BUILTIN_CPU_IS_AMDFAM15H_BDVER2:
>> + field = get_field_from_struct (__processor_model_type,
>> + M_AMDFAM15H_BDVER2);
>> + which_struct = __cpu_model_var;
>> + break;
>> + default:
>> + return NULL_TREE;
>> + }
>> +
>> + return build3 (COMPONENT_REF, TREE_TYPE (field), which_struct, field, NULL_TREE);
>> +}
>> +
>> +static tree
>> +ix86_fold_builtin (tree fndecl, int n_args ATTRIBUTE_UNUSED,
>> + tree *args ATTRIBUTE_UNUSED, bool ignore ATTRIBUTE_UNUSED)
>> +{
>> + const char* decl_name = IDENTIFIER_POINTER (DECL_NAME (fndecl));
>> + if (DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_MD
>> + && strstr(decl_name, "__builtin_cpu") != NULL)
>> + {
>> + enum ix86_builtins code = (enum ix86_builtins)
>> + DECL_FUNCTION_CODE (fndecl);
>> + return fold_builtin_cpu (code);
>> + }
>> + return NULL_TREE;
>> +}
>> +
>> +/* A builtin to init/return the cpu type or feature. Returns an
>> + integer and the type is a const if IS_CONST is set. */
>> +
>> +static void
>> +make_platform_builtin (const char* name, int code, int is_const)
>> +{
>> + tree decl;
>> + tree type;
>> +
>> + type = ix86_get_builtin_func_type (INT_FTYPE_VOID);
>> + decl = add_builtin_function (name, type, code, BUILT_IN_MD,
>> + NULL, NULL_TREE);
>> + gcc_assert (decl != NULL_TREE);
>> + ix86_builtins[(int) code] = decl;
>> + if (is_const)
>> + TREE_READONLY (decl) = 1;
>> +}
>> +
>> +/* Builtins to get CPU type and features supported. */
>> +
>> +static void
>> +ix86_init_platform_type_builtins (void)
>> +{
>> + make_platform_builtin ("__builtin_cpu_init",
>> + IX86_BUILTIN_CPU_INIT, 0);
>> + make_platform_builtin ("__builtin_cpu_supports_cmov",
>> + IX86_BUILTIN_CPU_SUPPORTS_CMOV, 1);
>> + make_platform_builtin ("__builtin_cpu_supports_mmx",
>> + IX86_BUILTIN_CPU_SUPPORTS_MMX, 1);
>> + make_platform_builtin ("__builtin_cpu_supports_popcount",
>> + IX86_BUILTIN_CPU_SUPPORTS_POPCOUNT, 1);
>> + make_platform_builtin ("__builtin_cpu_supports_sse",
>> + IX86_BUILTIN_CPU_SUPPORTS_SSE, 1);
>> + make_platform_builtin ("__builtin_cpu_supports_sse2",
>> + IX86_BUILTIN_CPU_SUPPORTS_SSE2, 1);
>> + make_platform_builtin ("__builtin_cpu_supports_sse3",
>> + IX86_BUILTIN_CPU_SUPPORTS_SSE3, 1);
>> + make_platform_builtin ("__builtin_cpu_supports_ssse3",
>> + IX86_BUILTIN_CPU_SUPPORTS_SSSE3, 1);
>> + make_platform_builtin ("__builtin_cpu_supports_sse4_1",
>> + IX86_BUILTIN_CPU_SUPPORTS_SSE4_1, 1);
>> + make_platform_builtin ("__builtin_cpu_supports_sse4_2",
>> + IX86_BUILTIN_CPU_SUPPORTS_SSE4_2, 1);
>> + make_platform_builtin ("__builtin_cpu_is_amd",
>> + IX86_BUILTIN_CPU_IS_AMD, 1);
>> + make_platform_builtin ("__builtin_cpu_is_intel_atom",
>> + IX86_BUILTIN_CPU_IS_INTEL_ATOM, 1);
>> + make_platform_builtin ("__builtin_cpu_is_intel_core2",
>> + IX86_BUILTIN_CPU_IS_INTEL_CORE2, 1);
>> + make_platform_builtin ("__builtin_cpu_is_intel",
>> + IX86_BUILTIN_CPU_IS_INTEL, 1);
>> + make_platform_builtin ("__builtin_cpu_is_intel_corei7",
>> + IX86_BUILTIN_CPU_IS_INTEL_COREI7, 1);
>> + make_platform_builtin ("__builtin_cpu_is_intel_corei7_nehalem",
>> + IX86_BUILTIN_CPU_IS_INTEL_COREI7_NEHALEM, 1);
>> + make_platform_builtin ("__builtin_cpu_is_intel_corei7_westmere",
>> + IX86_BUILTIN_CPU_IS_INTEL_COREI7_WESTMERE, 1);
>> + make_platform_builtin ("__builtin_cpu_is_intel_corei7_sandybridge",
>> + IX86_BUILTIN_CPU_IS_INTEL_COREI7_SANDYBRIDGE, 1);
>> + make_platform_builtin ("__builtin_cpu_is_amdfam10",
>> + IX86_BUILTIN_CPU_IS_AMDFAM10H, 1);
>> + make_platform_builtin ("__builtin_cpu_is_amdfam10_barcelona",
>> + IX86_BUILTIN_CPU_IS_AMDFAM10H_BARCELONA, 1);
>> + make_platform_builtin ("__builtin_cpu_is_amdfam10_shanghai",
>> + IX86_BUILTIN_CPU_IS_AMDFAM10H_SHANGHAI, 1);
>> + make_platform_builtin ("__builtin_cpu_is_amdfam10_istanbul",
>> + IX86_BUILTIN_CPU_IS_AMDFAM10H_ISTANBUL, 1);
>> + make_platform_builtin ("__builtin_cpu_is_amdfam15_bdver1",
>> + IX86_BUILTIN_CPU_IS_AMDFAM15H_BDVER1, 1);
>> + make_platform_builtin ("__builtin_cpu_is_amdfam15_bdver2",
>> + IX86_BUILTIN_CPU_IS_AMDFAM15H_BDVER2, 1);
>> +}
>> +
>> /* Internal method for ix86_init_builtins. */
>>
>> static void
>> @@ -27529,6 +28143,9 @@ ix86_init_builtins (void)
>>
>> ix86_init_builtin_types ();
>>
>> + /* Builtins to get CPU type and features. */
>> + ix86_init_platform_type_builtins ();
>> +
>> /* TFmode support builtins. */
>> def_builtin_const (0, "__builtin_infq",
>> FLOAT128_FTYPE_VOID, IX86_BUILTIN_INFQ);
>> @@ -29145,6 +29762,48 @@ ix86_expand_builtin (tree exp, rtx target, rtx sub
>> enum machine_mode mode0, mode1, mode2, mode3, mode4;
>> unsigned int fcode = DECL_FUNCTION_CODE (fndecl);
>>
>> + /* For CPU builtins that can be folded, fold first and expand the fold. */
>> + switch (fcode)
>> + {
>> + case IX86_BUILTIN_CPU_SUPPORTS_CMOV:
>> + case IX86_BUILTIN_CPU_SUPPORTS_MMX:
>> + case IX86_BUILTIN_CPU_SUPPORTS_POPCOUNT:
>> + case IX86_BUILTIN_CPU_SUPPORTS_SSE:
>> + case IX86_BUILTIN_CPU_SUPPORTS_SSE2:
>> + case IX86_BUILTIN_CPU_SUPPORTS_SSE3:
>> + case IX86_BUILTIN_CPU_SUPPORTS_SSSE3:
>> + case IX86_BUILTIN_CPU_SUPPORTS_SSE4_1:
>> + case IX86_BUILTIN_CPU_SUPPORTS_SSE4_2:
>> + case IX86_BUILTIN_CPU_IS_AMD:
>> + case IX86_BUILTIN_CPU_IS_INTEL:
>> + case IX86_BUILTIN_CPU_IS_INTEL_ATOM:
>> + case IX86_BUILTIN_CPU_IS_INTEL_CORE2:
>> + case IX86_BUILTIN_CPU_IS_INTEL_COREI7:
>> + case IX86_BUILTIN_CPU_IS_INTEL_COREI7_NEHALEM:
>> + case IX86_BUILTIN_CPU_IS_INTEL_COREI7_WESTMERE:
>> + case IX86_BUILTIN_CPU_IS_INTEL_COREI7_SANDYBRIDGE:
>> + case IX86_BUILTIN_CPU_IS_AMDFAM10H:
>> + case IX86_BUILTIN_CPU_IS_AMDFAM10H_BARCELONA:
>> + case IX86_BUILTIN_CPU_IS_AMDFAM10H_SHANGHAI:
>> + case IX86_BUILTIN_CPU_IS_AMDFAM10H_ISTANBUL:
>> + case IX86_BUILTIN_CPU_IS_AMDFAM15H_BDVER1:
>> + case IX86_BUILTIN_CPU_IS_AMDFAM15H_BDVER2:
>> + {
>> + tree fold_expr = fold_builtin_cpu ((enum ix86_builtins) fcode);
>> + gcc_assert (fold_expr != NULL_TREE);
>> + return expand_expr (fold_expr, target, mode, EXPAND_NORMAL);
>> + }
>> + case IX86_BUILTIN_CPU_INIT:
>> + {
>> + /* Make it call __cpu_indicator_init in libgcc. */
>> + tree call_expr, fndecl, type;
>> + type = build_function_type_list (integer_type_node, NULL_TREE);
>> + fndecl = build_fn_decl ("__cpu_indicator_init", type);
>> + call_expr = build_call_expr (fndecl, 0);
>> + return expand_expr (call_expr, target, mode, EXPAND_NORMAL);
>> + }
>> + }
>> +
>> /* Determine whether the builtin function is available under the current ISA.
>> Originally the builtin was not created if it wasn't applicable to the
>> current ISA based on the command line switches. With function specific
>> @@ -38610,6 +39269,12 @@ ix86_autovectorize_vector_sizes (void)
>> #undef TARGET_BUILD_BUILTIN_VA_LIST
>> #define TARGET_BUILD_BUILTIN_VA_LIST ix86_build_builtin_va_list
>>
>> +#undef TARGET_FOLD_BUILTIN
>> +#define TARGET_FOLD_BUILTIN ix86_fold_builtin
>> +
>> #undef TARGET_ENUM_VA_LIST_P
>> #define TARGET_ENUM_VA_LIST_P ix86_enum_va_list
>> Index: gcc/testsuite/gcc.target/i386/builtin_target.c
>> ===================================================================
>> --- gcc/testsuite/gcc.target/i386/builtin_target.c (revision 0)
>> +++ gcc/testsuite/gcc.target/i386/builtin_target.c (revision 0)
>> @@ -0,0 +1,61 @@
>> +/* This test checks if the __builtin_cpu_* calls are recognized. */
>> +
>> +/* { dg-do run } */
>> +
>> +int
>> +fn1 ()
>> +{
>> + if (__builtin_cpu_supports_cmov () < 0)
>> + return -1;
>> + if (__builtin_cpu_supports_mmx () < 0)
>> + return -1;
>> + if (__builtin_cpu_supports_popcount () < 0)
>> + return -1;
>> + if (__builtin_cpu_supports_sse () < 0)
>> + return -1;
>> + if (__builtin_cpu_supports_sse2 () < 0)
>> + return -1;
>> + if (__builtin_cpu_supports_sse3 () < 0)
>> + return -1;
>> + if (__builtin_cpu_supports_ssse3 () < 0)
>> + return -1;
>> + if (__builtin_cpu_supports_sse4_1 () < 0)
>> + return -1;
>> + if (__builtin_cpu_supports_sse4_2 () < 0)
>> + return -1;
>> + if (__builtin_cpu_is_amd () < 0)
>> + return -1;
>> + if (__builtin_cpu_is_intel () < 0)
>> + return -1;
>> + if (__builtin_cpu_is_intel_atom () < 0)
>> + return -1;
>> + if (__builtin_cpu_is_intel_core2 () < 0)
>> + return -1;
>> + if (__builtin_cpu_is_intel_corei7 () < 0)
>> + return -1;
>> + if (__builtin_cpu_is_intel_corei7_nehalem () < 0)
>> + return -1;
>> + if (__builtin_cpu_is_intel_corei7_westmere () < 0)
>> + return -1;
>> + if (__builtin_cpu_is_intel_corei7_sandybridge () < 0)
>> + return -1;
>> + if (__builtin_cpu_is_amdfam10 () < 0)
>> + return -1;
>> + if (__builtin_cpu_is_amdfam10_barcelona () < 0)
>> + return -1;
>> + if (__builtin_cpu_is_amdfam10_shanghai () < 0)
>> + return -1;
>> + if (__builtin_cpu_is_amdfam10_istanbul () < 0)
>> + return -1;
>> + if (__builtin_cpu_is_amdfam15_bdver1 () < 0)
>> + return -1;
>> + if (__builtin_cpu_is_amdfam15_bdver2 () < 0)
>> + return -1;
>> +
>> + return 0;
>> +}
>> +
>> +int main ()
>> +{
>> + return fn1 ();
>> +}
>>
>> --
>> This patch is available for review at http://codereview.appspot.com/5754058
Message from richard.guenther@gmail.com
2012-03-12T11:16:39+00:00richard.guenther_gmail.comurn:md5:1bf13c7fab4ccd553f0577978f5bfa97
On Thu, Mar 8, 2012 at 9:35 PM, Xinliang David Li <davidxl@google.com> wrote:
> On Wed, Mar 7, 2012 at 5:51 AM, Richard Guenther
> <richard.guenther@gmail.com> wrote:
>> On Wed, Mar 7, 2012 at 1:49 AM, Sriraman Tallam <tmsriram@google.com> wrote:
>>> Patch for CPU detection at run-time.
>>> ===================================
>>>
>>> Patch for CPU detection at run-time, to be used in dispatching of
>>> multi-versioned functions. Please see this discussion:
>>> http://gcc.gnu.org/ml/gcc-patches/2011-08/msg01355.html
>>> when this patch was reviewed the last time.
>>>
>>> For more detailed description:
>>> http://gcc.gnu.org/ml/gcc/2012-03/msg00074.html
>>>
>>> One of the main concerns was about making CPU detection initialization a
>>> constructor. The main point raised was about constructor ordering. I have
>>> added a priority value to the CPU detection constructor to make it very high
>>> priority so that it is guaranteed to fire before every constructor that is
>>> not itself explicitly marked with priority 101. However, IFUNC initializers
>>> will still fire before this constructor, so the cpu initialization routine
>>> has to be explicitly called in such initializers for which I have added a
>>> builtin: __builtin_cpu_init ().
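The ordering guarantee described above can be sketched in plain C. This is only an illustration, assuming GCC-style constructor attributes on an ELF target; detect_cpu, default_ctor and cpu_ready are hypothetical names standing in for the libgcc routine and its state, not part of the patch.

```c
#include <assert.h>

/* Hypothetical stand-in for the state that CPU detection fills in.  */
static int cpu_ready = 0;
static int seen_by_default_ctor = -1;

/* 101 is the smallest priority available to user code; constructors
   with smaller numbers run earlier.  */
__attribute__ ((constructor (101)))
static void detect_cpu (void)
{
  cpu_ready = 1;
}

/* No explicit priority: runs after all prioritized constructors.  */
__attribute__ ((constructor))
static void default_ctor (void)
{
  seen_by_default_ctor = cpu_ready;
}

/* Reports what the unprioritized constructor observed.  */
int default_ctor_saw_cpu_ready (void)
{
  return seen_by_default_ctor;
}
```

Called from main, default_ctor_saw_cpu_ready () should report that detection had already run. An IFUNC resolver gets no such guarantee, which is why the explicit init builtin is still needed there.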
>>>
>>> This patch adds the following new builtins:
>>>
>>> * __builtin_cpu_init
>>> * __builtin_cpu_supports_cmov
>>> * __builtin_cpu_supports_mmx
>>> * __builtin_cpu_supports_popcount
>>> * __builtin_cpu_supports_sse
>>> * __builtin_cpu_supports_sse2
>>> * __builtin_cpu_supports_sse3
>>> * __builtin_cpu_supports_ssse3
>>> * __builtin_cpu_supports_sse4_1
>>> * __builtin_cpu_supports_sse4_2
>>> * __builtin_cpu_is_amd
>>> * __builtin_cpu_is_intel_atom
>>> * __builtin_cpu_is_intel_core2
>>> * __builtin_cpu_is_intel
>>> * __builtin_cpu_is_intel_corei7
>>> * __builtin_cpu_is_intel_corei7_nehalem
>>> * __builtin_cpu_is_intel_corei7_westmere
>>> * __builtin_cpu_is_intel_corei7_sandybridge
>>> * __builtin_cpu_is_amdfam10
>>> * __builtin_cpu_is_amdfam10_barcelona
>>> * __builtin_cpu_is_amdfam10_shanghai
>>> * __builtin_cpu_is_amdfam10_istanbul
>>> * __builtin_cpu_is_amdfam15_bdver1
>>> * __builtin_cpu_is_amdfam15_bdver2
>>
>> I think the non-feature detection functions are not necessary at all.
>
> They are useful if the compiler needs to do auto versioning based on cpu model.
>
>> Builtin functions are not exactly cheap, nor is the scheme you invent
>> backward/forward compatible. Instead, why not add a single builtin
>> function, __builtin_cpu_supports(const char *), and decode from
>> a comma-separated list of features? Unknown features are simply
>> "not present". So I can write code with only a single configure check,
>
> This is a good idea.
>
> __builtin_is_cpu (const char* );
> __builtin_cpu_supports (const char*);
That looks good to me.
Richard.
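For reference, a minimal sketch of what caller code could look like with the string-based interface. It assumes the spelling GCC later shipped (__builtin_cpu_init, __builtin_cpu_supports and __builtin_cpu_is, rather than __builtin_is_cpu as written above) and an x86 target; have_sse4_2 and cpu_vendor are hypothetical helper names.

```c
#include <assert.h>

/* Only assume the builtins exist on x86 with a GNU-compatible compiler.  */
#if (defined (__x86_64__) || defined (__i386__)) && defined (__GNUC__)
#define HAVE_CPU_BUILTINS 1
#else
#define HAVE_CPU_BUILTINS 0
#endif

/* Returns 1 if the running CPU reports SSE4.2, 0 if it does not, and
   -1 when built for a target where the builtins are not assumed.  */
int have_sse4_2 (void)
{
#if HAVE_CPU_BUILTINS
  __builtin_cpu_init ();
  return __builtin_cpu_supports ("sse4.2") ? 1 : 0;
#else
  return -1;
#endif
}

/* Returns a vendor string; "unknown" covers every other case.  */
const char *cpu_vendor (void)
{
#if HAVE_CPU_BUILTINS
  __builtin_cpu_init ();
  if (__builtin_cpu_is ("intel"))
    return "intel";
  if (__builtin_cpu_is ("amd"))
    return "amd";
#endif
  return "unknown";
}
```

With this shape a configure script needs only one check, for the builtin itself, and new feature names do not require new builtins.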
> thanks,
>
> David
>
>
>> for __builtin_cpu_supports, and cater for future features or older compilers.
>>
>> And of course that builtin would be even cross-platform.
>>
>> Implementation-wise I'll leave this to x86 maintainers to comment on.
>>
>> Richard.
>>
>>>
>>> * config/i386/i386.c (build_struct_with_one_bit_fields): New function.
>>> (make_var_decl): New function.
>>> (get_field_from_struct): New function.
>>> (fold_builtin_cpu): New function.
>>> (ix86_fold_builtin): New function.
>>> (ix86_expand_builtin): Expand new builtins by folding them.
>>> (make_platform_builtin): New function.
>>> (ix86_init_platform_type_builtins): Make the new builtins.
>>> (ix86_init_builtins): Make new builtins to detect CPU type.
>>> (TARGET_FOLD_BUILTIN): New macro.
>>> (IX86_BUILTIN_CPU_SUPPORTS_CMOV): New enum value.
>>> (IX86_BUILTIN_CPU_SUPPORTS_MMX): New enum value.
>>> (IX86_BUILTIN_CPU_SUPPORTS_POPCOUNT): New enum value.
>>> (IX86_BUILTIN_CPU_SUPPORTS_SSE): New enum value.
>>> (IX86_BUILTIN_CPU_SUPPORTS_SSE2): New enum value.
>>> (IX86_BUILTIN_CPU_SUPPORTS_SSE3): New enum value.
>>> (IX86_BUILTIN_CPU_SUPPORTS_SSSE3): New enum value.
>>> (IX86_BUILTIN_CPU_SUPPORTS_SSE4_1): New enum value.
>>> (IX86_BUILTIN_CPU_SUPPORTS_SSE4_2): New enum value.
>>> (IX86_BUILTIN_CPU_INIT): New enum value.
>>> (IX86_BUILTIN_CPU_IS_AMD): New enum value.
>>> (IX86_BUILTIN_CPU_IS_INTEL): New enum value.
>>> (IX86_BUILTIN_CPU_IS_INTEL_ATOM): New enum value.
>>> (IX86_BUILTIN_CPU_IS_INTEL_CORE2): New enum value.
>>> (IX86_BUILTIN_CPU_IS_INTEL_COREI7_NEHALEM): New enum value.
>>> (IX86_BUILTIN_CPU_IS_INTEL_COREI7_WESTMERE): New enum value.
>>> (IX86_BUILTIN_CPU_IS_INTEL_COREI7_SANDYBRIDGE): New enum value.
>>> (IX86_BUILTIN_CPU_IS_AMDFAM10H_BARCELONA): New enum value.
>>> (IX86_BUILTIN_CPU_IS_AMDFAM10H_SHANGHAI): New enum value.
>>> (IX86_BUILTIN_CPU_IS_AMDFAM10H_ISTANBUL): New enum value.
>>> (IX86_BUILTIN_CPU_IS_AMDFAM15H_BDVER1): New enum value.
>>> (IX86_BUILTIN_CPU_IS_AMDFAM15H_BDVER2): New enum value.
>>> * config/i386/i386-builtin-types.def: New function type.
>>> * testsuite/gcc.target/i386/builtin_target.c: New testcase.
>>>
>>> * libgcc/config/i386/i386-cpuinfo.c: New file.
>>> * libgcc/config/i386/t-cpuinfo: New file.
>>> * libgcc/config.host: Include t-cpuinfo.
>>> * libgcc/config/i386/libgcc-glibc.ver: Version symbols __cpu_model
>>> and __cpu_features.
>>>
>>> Index: libgcc/config.host
>>> ===================================================================
>>> --- libgcc/config.host (revision 184971)
>>> +++ libgcc/config.host (working copy)
>>> @@ -1142,7 +1142,7 @@ i[34567]86-*-linux* | x86_64-*-linux* | \
>>> i[34567]86-*-kfreebsd*-gnu | x86_64-*-kfreebsd*-gnu | \
>>> i[34567]86-*-knetbsd*-gnu | \
>>> i[34567]86-*-gnu*)
>>> - tmake_file="${tmake_file} t-tls i386/t-linux"
>>> + tmake_file="${tmake_file} t-tls i386/t-linux i386/t-cpuinfo"
>>> if test "$libgcc_cv_cfi" = "yes"; then
>>> tmake_file="${tmake_file} t-stack i386/t-stack-i386"
>>> fi
>>> Index: libgcc/config/i386/t-cpuinfo
>>> ===================================================================
>>> --- libgcc/config/i386/t-cpuinfo (revision 0)
>>> +++ libgcc/config/i386/t-cpuinfo (revision 0)
>>> @@ -0,0 +1 @@
>>> +LIB2ADD += $(srcdir)/config/i386/i386-cpuinfo.c
>>> Index: libgcc/config/i386/i386-cpuinfo.c
>>> ===================================================================
>>> --- libgcc/config/i386/i386-cpuinfo.c (revision 0)
>>> +++ libgcc/config/i386/i386-cpuinfo.c (revision 0)
>>> @@ -0,0 +1,306 @@
>>> +/* Get CPU type and Features for x86 processors.
>>> + Copyright (C) 2011 Free Software Foundation, Inc.
>>> + Contributed by Sriraman Tallam (tmsriram@google.com)
>>> +
>>> +This file is part of GCC.
>>> +
>>> +GCC is free software; you can redistribute it and/or modify it under
>>> +the terms of the GNU General Public License as published by the Free
>>> +Software Foundation; either version 3, or (at your option) any later
>>> +version.
>>> +
>>> +GCC is distributed in the hope that it will be useful, but WITHOUT ANY
>>> +WARRANTY; without even the implied warranty of MERCHANTABILITY or
>>> +FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
>>> +for more details.
>>> +
>>> +You should have received a copy of the GNU General Public License
>>> +along with GCC; see the file COPYING3. If not see
>>> +<http://www.gnu.org/licenses/>. */
>>> +
>>> +#include "cpuid.h"
>>> +#include "tsystem.h"
>>> +
>>> +int __cpu_indicator_init (void) __attribute__ ((constructor (101)));
>>> +
>>> +enum vendor_signatures
>>> +{
>>> + SIG_INTEL = 0x756e6547 /* Genu */,
>>> + SIG_AMD = 0x68747541 /* Auth */
>>> +};
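The two constants above are the first four characters of the vendor string as CPUID leaf 0 returns them in EBX, with the first character in the least significant byte. A small sketch to check the values; pack_signature is a hypothetical helper, not something the patch defines.

```c
#include <assert.h>
#include <stdint.h>

/* Pack four ASCII characters the way CPUID leaf 0 lays them out in
   EBX: s[0] in bits 0..7, s[3] in bits 24..31.  */
uint32_t pack_signature (const char s[4])
{
  return (uint32_t) (unsigned char) s[0]
         | (uint32_t) (unsigned char) s[1] << 8
         | (uint32_t) (unsigned char) s[2] << 16
         | (uint32_t) (unsigned char) s[3] << 24;
}
```

pack_signature ("Genu") and pack_signature ("Auth") give the SIG_INTEL and SIG_AMD values respectively. Only EBX is compared, so the remaining eight characters of the vendor string are not checked.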
>>> +
>>> +/* ISA Features supported. */
>>> +
>>> +struct __processor_features
>>> +{
>>> + unsigned int __cpu_cmov : 1;
>>> + unsigned int __cpu_mmx : 1;
>>> + unsigned int __cpu_popcnt : 1;
>>> + unsigned int __cpu_sse : 1;
>>> + unsigned int __cpu_sse2 : 1;
>>> + unsigned int __cpu_sse3 : 1;
>>> + unsigned int __cpu_ssse3 : 1;
>>> + unsigned int __cpu_sse4_1 : 1;
>>> + unsigned int __cpu_sse4_2 : 1;
>>> +} __cpu_features;
>>> +
>>> +/* Processor Model. */
>>> +
>>> +struct __processor_model
>>> +{
>>> + /* Vendor. */
>>> + unsigned int __cpu_is_amd : 1;
>>> + unsigned int __cpu_is_intel : 1;
>>> + /* CPU type. */
>>> + unsigned int __cpu_is_intel_atom : 1;
>>> + unsigned int __cpu_is_intel_core2 : 1;
>>> + unsigned int __cpu_is_intel_corei7 : 1;
>>> + unsigned int __cpu_is_intel_corei7_nehalem : 1;
>>> + unsigned int __cpu_is_intel_corei7_westmere : 1;
>>> + unsigned int __cpu_is_intel_corei7_sandybridge : 1;
>>> + unsigned int __cpu_is_amdfam10h : 1;
>>> + unsigned int __cpu_is_amdfam10h_barcelona : 1;
>>> + unsigned int __cpu_is_amdfam10h_shanghai : 1;
>>> + unsigned int __cpu_is_amdfam10h_istanbul : 1;
>>> + unsigned int __cpu_is_amdfam15h_bdver1 : 1;
>>> + unsigned int __cpu_is_amdfam15h_bdver2 : 1;
>>> +} __cpu_model;
>>> +
>>> +/* Get the specific type of AMD CPU. */
>>> +
>>> +static void
>>> +get_amd_cpu (unsigned int family, unsigned int model)
>>> +{
>>> + switch (family)
>>> + {
>>> + /* AMD Family 10h. */
>>> + case 0x10:
>>> + switch (model)
>>> + {
>>> + case 0x2:
>>> + /* Barcelona. */
>>> + __cpu_model.__cpu_is_amdfam10h = 1;
>>> + __cpu_model.__cpu_is_amdfam10h_barcelona = 1;
>>> + break;
>>> + case 0x4:
>>> + /* Shanghai. */
>>> + __cpu_model.__cpu_is_amdfam10h = 1;
>>> + __cpu_model.__cpu_is_amdfam10h_shanghai = 1;
>>> + break;
>>> + case 0x8:
>>> + /* Istanbul. */
>>> + __cpu_model.__cpu_is_amdfam10h = 1;
>>> + __cpu_model.__cpu_is_amdfam10h_istanbul = 1;
>>> + break;
>>> + default:
>>> + break;
>>> + }
>>> + break;
>>> + /* AMD Family 15h. */
>>> + case 0x15:
>>> + /* Bulldozer version 1. */
>>> + if (model <= 0xf)
>>> + __cpu_model.__cpu_is_amdfam15h_bdver1 = 1;
>>> + /* Bulldozer version 2. */
>>> + if (model >= 0x10 && model <= 0x1f)
>>> + __cpu_model.__cpu_is_amdfam15h_bdver2 = 1;
>>> + break;
>>> + default:
>>> + break;
>>> + }
>>> +}
>>> +
>>> +/* Get the specific type of Intel CPU. */
>>> +
>>> +static void
>>> +get_intel_cpu (unsigned int family, unsigned int model, unsigned int brand_id)
>>> +{
>>> + /* Parse family and model only if brand ID is 0. */
>>> + if (brand_id == 0)
>>> + {
>>> + switch (family)
>>> + {
>>> + case 0x5:
>>> + /* Pentium. */
>>> + break;
>>> + case 0x6:
>>> + switch (model)
>>> + {
>>> + case 0x1c:
>>> + case 0x26:
>>> + /* Atom. */
>>> + __cpu_model.__cpu_is_intel_atom = 1;
>>> + break;
>>> + case 0x1a:
>>> + case 0x1e:
>>> + case 0x1f:
>>> + case 0x2e:
>>> + /* Nehalem. */
>>> + __cpu_model.__cpu_is_intel_corei7 = 1;
>>> + __cpu_model.__cpu_is_intel_corei7_nehalem = 1;
>>> + break;
>>> + case 0x25:
>>> + case 0x2c:
>>> + case 0x2f:
>>> + /* Westmere. */
>>> + __cpu_model.__cpu_is_intel_corei7 = 1;
>>> + __cpu_model.__cpu_is_intel_corei7_westmere = 1;
>>> + break;
>>> + case 0x2a:
>>> + /* Sandy Bridge. */
>>> + __cpu_model.__cpu_is_intel_corei7 = 1;
>>> + __cpu_model.__cpu_is_intel_corei7_sandybridge = 1;
>>> + break;
>>> + case 0x17:
>>> + case 0x1d:
>>> + /* Penryn. */
>>> + case 0x0f:
>>> + /* Merom. */
>>> + __cpu_model.__cpu_is_intel_core2 = 1;
>>> + break;
>>> + default:
>>> + break;
>>> + }
>>> + break;
>>> + default:
>>> + /* We have no idea. */
>>> + break;
>>> + }
>>> + }
>>> +}
>>> +
>>> +static void
>>> +get_available_features (unsigned int ecx, unsigned int edx)
>>> +{
>>> + __cpu_features.__cpu_cmov = (edx & bit_CMOV) ? 1 : 0;
>>> + __cpu_features.__cpu_mmx = (edx & bit_MMX) ? 1 : 0;
>>> + __cpu_features.__cpu_sse = (edx & bit_SSE) ? 1 : 0;
>>> + __cpu_features.__cpu_sse2 = (edx & bit_SSE2) ? 1 : 0;
>>> + __cpu_features.__cpu_popcnt = (ecx & bit_POPCNT) ? 1 : 0;
>>> + __cpu_features.__cpu_sse3 = (ecx & bit_SSE3) ? 1 : 0;
>>> + __cpu_features.__cpu_ssse3 = (ecx & bit_SSSE3) ? 1 : 0;
>>> + __cpu_features.__cpu_sse4_1 = (ecx & bit_SSE4_1) ? 1 : 0;
>>> + __cpu_features.__cpu_sse4_2 = (ecx & bit_SSE4_2) ? 1 : 0;
>>> +}
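For comparison, the same kind of test can be written directly against <cpuid.h> without any libgcc state. This is a sketch assuming an x86 target and GCC's (or Clang's) cpuid.h; struct features and query_features are hypothetical names, and only two of the nine bits are shown.

```c
#include <assert.h>
#if defined (__x86_64__) || defined (__i386__)
#include <cpuid.h>
#endif

/* KNOWN is 0 when CPUID leaf 1 could not be queried.  */
struct features
{
  int known;
  int sse2;
  int popcnt;
};

struct features query_features (void)
{
  struct features f = { 0, 0, 0 };
#if defined (__x86_64__) || defined (__i386__)
  unsigned int eax, ebx, ecx, edx;

  /* __get_cpuid returns 0 if the requested leaf is unsupported.  */
  if (__get_cpuid (1, &eax, &ebx, &ecx, &edx))
    {
      f.known = 1;
      f.sse2 = (edx & bit_SSE2) != 0;     /* Reported in EDX.  */
      f.popcnt = (ecx & bit_POPCNT) != 0; /* Reported in ECX.  */
    }
#endif
  return f;
}
```

This runs CPUID on every call, which is the cost the cached __cpu_features bit-fields avoid.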
>>> +
>>> +
>>> +/* Sanity check for the vendor and cpu type flags. */
>>> +
>>> +static int
>>> +sanity_check (void)
>>> +{
>>> + unsigned int one_type = 0;
>>> +
>>> + /* Vendor cannot be Intel and AMD. */
>>> + gcc_assert((__cpu_model.__cpu_is_intel == 0)
>>> + || (__cpu_model.__cpu_is_amd == 0));
>>> +
>>> + /* Only one CPU type can be set. */
>>> + one_type = (__cpu_model.__cpu_is_intel_atom
>>> + + __cpu_model.__cpu_is_intel_core2
>>> + + __cpu_model.__cpu_is_intel_corei7_nehalem
>>> + + __cpu_model.__cpu_is_intel_corei7_westmere
>>> + + __cpu_model.__cpu_is_intel_corei7_sandybridge
>>> + + __cpu_model.__cpu_is_amdfam10h_barcelona
>>> + + __cpu_model.__cpu_is_amdfam10h_shanghai
>>> + + __cpu_model.__cpu_is_amdfam10h_istanbul
>>> + + __cpu_model.__cpu_is_amdfam15h_bdver1
>>> + + __cpu_model.__cpu_is_amdfam15h_bdver2);
>>> +
>>> + gcc_assert (one_type <= 1);
>>> + return 0;
>>> +}
>>> +
>>> +/* A noinline function calling __get_cpuid. Having many calls to
>>> + cpuid in one function in 32-bit mode causes GCC to complain:
>>> + "can’t find a register in class ‘CLOBBERED_REGS’". This is
>>> + related to PR rtl-optimization 44174. */
>>> +
>>> +static int __attribute__ ((noinline))
>>> +__get_cpuid_output (unsigned int __level,
>>> + unsigned int *__eax, unsigned int *__ebx,
>>> + unsigned int *__ecx, unsigned int *__edx)
>>> +{
>>> + return __get_cpuid (__level, __eax, __ebx, __ecx, __edx);
>>> +}
>>> +
>>> +
>>> +/* A constructor function that sets __cpu_model and __cpu_features with
>>> + the right values. This needs to run only once. This constructor is
>>> + given the highest priority and it should run before constructors without
>>> + the priority set. However, it still runs after ifunc initializers and
>>> + needs to be called explicitly there. */
>>> +
>>> +int __attribute__ ((constructor (101)))
>>> +__cpu_indicator_init (void)
>>> +{
>>> + unsigned int eax, ebx, ecx, edx;
>>> +
>>> + int max_level = 5;
>>> + unsigned int vendor;
>>> + unsigned int model, family, brand_id;
>>> + unsigned int extended_model, extended_family;
>>> + static int called = 0;
>>> +
>>> + /* This function needs to run just once. */
>>> + if (called)
>>> + return 0;
>>> + else
>>> + called = 1;
>>> +
>>> + /* Assume cpuid insn present. Run in level 0 to get vendor id. */
>>> + if (!__get_cpuid_output (0, &eax, &ebx, &ecx, &edx))
>>> + return -1;
>>> +
>>> + vendor = ebx;
>>> + max_level = eax;
>>> +
>>> + if (max_level < 1)
>>> + return -1;
>>> +
>>> + if (!__get_cpuid_output (1, &eax, &ebx, &ecx, &edx))
>>> + return -1;
>>> +
>>> + model = (eax >> 4) & 0x0f;
>>> + family = (eax >> 8) & 0x0f;
>>> + brand_id = ebx & 0xff;
>>> + extended_model = (eax >> 12) & 0xf0;
>>> + extended_family = (eax >> 20) & 0xff;
>>> +
>>> + if (vendor == SIG_INTEL)
>>> + {
>>> + /* Adjust model and family for Intel CPUS. */
>>> + if (family == 0x0f)
>>> + {
>>> + family += extended_family;
>>> + model += extended_model;
>>> + }
>>> + else if (family == 0x06)
>>> + model += extended_model;
>>> +
>>> + /* Get CPU type. */
>>> + __cpu_model.__cpu_is_intel = 1;
>>> + get_intel_cpu (family, model, brand_id);
>>> + }
>>> +
>>> + if (vendor == SIG_AMD)
>>> + {
>>> + /* Adjust model and family for AMD CPUS. */
>>> + if (family == 0x0f)
>>> + {
>>> + family += extended_family;
>>> + model += extended_model;
>>> + }
>>> +
>>> + /* Get CPU type. */
>>> + __cpu_model.__cpu_is_amd = 1;
>>> + get_amd_cpu (family, model);
>>> + }
>>> +
>>> + /* Find available features. */
>>> + get_available_features (ecx, edx);
>>> +
>>> + sanity_check ();
>>> +
>>> + return 0;
>>> +}
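The family and model arithmetic is easy to get wrong, so here is a standalone sketch of the decoding as I understand the documented CPUID leaf 1 layout. The patch keeps the extended model pre-shifted into bits 4..7 via (eax >> 12) & 0xf0; the sketch extracts it unshifted and combines it explicitly. struct cpu_signature and decode_signature are hypothetical names, and the rule applied is the one from the Intel manuals (the AMD case, base family 0xf, follows the same path).

```c
#include <assert.h>

struct cpu_signature
{
  unsigned int family;
  unsigned int model;
};

/* Decode the EAX value returned by CPUID leaf 1.  */
struct cpu_signature decode_signature (unsigned int eax)
{
  unsigned int base_model = (eax >> 4) & 0x0f;
  unsigned int base_family = (eax >> 8) & 0x0f;
  unsigned int ext_model = (eax >> 16) & 0x0f;
  unsigned int ext_family = (eax >> 20) & 0xff;
  struct cpu_signature sig = { base_family, base_model };

  /* The extended family only applies when the base family is 0xf.  */
  if (base_family == 0x0f)
    sig.family = base_family + ext_family;

  /* The extended model supplies the high nibble for families 6 and 0xf.  */
  if (base_family == 0x06 || base_family == 0x0f)
    sig.model = (ext_model << 4) | base_model;

  return sig;
}
```

For example, 0x000206a7 decodes to family 6, model 0x2a, and 0x00100f22 decodes to family 0x10, model 2. The sample values are chosen for illustration only.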
>>> Index: libgcc/config/i386/libgcc-glibc.ver
>>> ===================================================================
>>> --- libgcc/config/i386/libgcc-glibc.ver (revision 184971)
>>> +++ libgcc/config/i386/libgcc-glibc.ver (working copy)
>>> @@ -147,6 +147,11 @@ GCC_4.3.0 {
>>> __trunctfxf2
>>> __unordtf2
>>> }
>>> +
>>> +GCC_4.8.0 {
>>> + __cpu_model
>>> + __cpu_features
>>> +}
>>> %else
>>> GCC_4.4.0 {
>>> __addtf3
>>> @@ -183,4 +188,9 @@ GCC_4.4.0 {
>>> GCC_4.5.0 {
>>> __extendxftf2
>>> }
>>> +
>>> +GCC_4.8.0 {
>>> + __cpu_model
>>> + __cpu_features
>>> +}
>>> %endif
>>> Index: gcc/config/i386/i386-builtin-types.def
>>> ===================================================================
>>> --- gcc/config/i386/i386-builtin-types.def (revision 184971)
>>> +++ gcc/config/i386/i386-builtin-types.def (working copy)
>>> @@ -143,6 +143,7 @@ DEF_FUNCTION_TYPE (UINT64)
>>> DEF_FUNCTION_TYPE (UNSIGNED)
>>> DEF_FUNCTION_TYPE (VOID)
>>> DEF_FUNCTION_TYPE (PVOID)
>>> +DEF_FUNCTION_TYPE (INT)
>>>
>>> DEF_FUNCTION_TYPE (FLOAT, FLOAT)
>>> DEF_FUNCTION_TYPE (FLOAT128, FLOAT128)
>>> Index: gcc/config/i386/i386.c
>>> ===================================================================
>>> --- gcc/config/i386/i386.c (revision 184971)
>>> +++ gcc/config/i386/i386.c (working copy)
>>> @@ -25637,6 +25637,33 @@ enum ix86_builtins
>>> /* CFString built-in for darwin */
>>> IX86_BUILTIN_CFSTRING,
>>>
>>> + /* Builtins to get CPU features. */
>>> + IX86_BUILTIN_CPU_SUPPORTS_CMOV,
>>> + IX86_BUILTIN_CPU_SUPPORTS_MMX,
>>> + IX86_BUILTIN_CPU_SUPPORTS_POPCOUNT,
>>> + IX86_BUILTIN_CPU_SUPPORTS_SSE,
>>> + IX86_BUILTIN_CPU_SUPPORTS_SSE2,
>>> + IX86_BUILTIN_CPU_SUPPORTS_SSE3,
>>> + IX86_BUILTIN_CPU_SUPPORTS_SSSE3,
>>> + IX86_BUILTIN_CPU_SUPPORTS_SSE4_1,
>>> + IX86_BUILTIN_CPU_SUPPORTS_SSE4_2,
>>> + /* Builtins to get CPU type. */
>>> + IX86_BUILTIN_CPU_INIT,
>>> + IX86_BUILTIN_CPU_IS_AMD,
>>> + IX86_BUILTIN_CPU_IS_INTEL,
>>> + IX86_BUILTIN_CPU_IS_INTEL_ATOM,
>>> + IX86_BUILTIN_CPU_IS_INTEL_CORE2,
>>> + IX86_BUILTIN_CPU_IS_INTEL_COREI7,
>>> + IX86_BUILTIN_CPU_IS_INTEL_COREI7_NEHALEM,
>>> + IX86_BUILTIN_CPU_IS_INTEL_COREI7_WESTMERE,
>>> + IX86_BUILTIN_CPU_IS_INTEL_COREI7_SANDYBRIDGE,
>>> + IX86_BUILTIN_CPU_IS_AMDFAM10H,
>>> + IX86_BUILTIN_CPU_IS_AMDFAM10H_BARCELONA,
>>> + IX86_BUILTIN_CPU_IS_AMDFAM10H_SHANGHAI,
>>> + IX86_BUILTIN_CPU_IS_AMDFAM10H_ISTANBUL,
>>> + IX86_BUILTIN_CPU_IS_AMDFAM15H_BDVER1,
>>> + IX86_BUILTIN_CPU_IS_AMDFAM15H_BDVER2,
>>> +
>>> IX86_BUILTIN_MAX
>>> };
>>>
>>> @@ -27446,6 +27473,593 @@ ix86_init_mmx_sse_builtins (void)
>>> }
>>> }
>>>
>>> +/* Returns a struct type with name NAME and number of fields equal to
>>> + NUM_FIELDS. Each field is an unsigned int bit field of length 1 bit. */
>>> +
>>> +static tree
>>> +build_struct_with_one_bit_fields (int num_fields, const char *name)
>>> +{
>>> + int i;
>>> + char field_name [10];
>>> + tree field = NULL_TREE, field_chain = NULL_TREE;
>>> + tree type = make_node (RECORD_TYPE);
>>> +
>>> + strcpy (field_name, "k_field");
>>> +
>>> + for (i = 0; i < num_fields; i++)
>>> + {
>>> + /* Name the fields, 0_field, 1_field, ... */
>>> + field_name [0] = '0' + i;
>>> + field = build_decl (UNKNOWN_LOCATION, FIELD_DECL,
>>> + get_identifier (field_name), unsigned_type_node);
>>> + DECL_BIT_FIELD (field) = 1;
>>> + DECL_SIZE (field) = bitsize_one_node;
>>> + if (field_chain != NULL_TREE)
>>> + DECL_CHAIN (field) = field_chain;
>>> + field_chain = field;
>>> + }
>>> + finish_builtin_struct (type, name, field_chain, NULL_TREE);
>>> + return type;
>>> +}
>>> +
>>> +/* Returns an extern, comdat VAR_DECL of type TYPE and name NAME. */
>>> +
>>> +static tree
>>> +make_var_decl (tree type, const char *name)
>>> +{
>>> + tree new_decl;
>>> + struct varpool_node *vnode;
>>> +
>>> + new_decl = build_decl (UNKNOWN_LOCATION,
>>> + VAR_DECL,
>>> + get_identifier(name),
>>> + type);
>>> +
>>> + DECL_EXTERNAL (new_decl) = 1;
>>> + TREE_STATIC (new_decl) = 1;
>>> + TREE_PUBLIC (new_decl) = 1;
>>> + DECL_INITIAL (new_decl) = 0;
>>> + DECL_ARTIFICIAL (new_decl) = 0;
>>> + DECL_PRESERVE_P (new_decl) = 1;
>>> +
>>> + make_decl_one_only (new_decl, DECL_ASSEMBLER_NAME (new_decl));
>>> + assemble_variable (new_decl, 0, 0, 0);
>>> +
>>> + vnode = varpool_node (new_decl);
>>> + gcc_assert (vnode != NULL);
>>> + /* Set finalized to 1, otherwise it asserts in function "write_symbol" in
>>> + lto-streamer-out.c. */
>>> + vnode->finalized = 1;
>>> +
>>> + return new_decl;
>>> +}
>>> +
>>> +/* Traverses the chain of fields in STRUCT_TYPE and returns the FIELD_NUM
>>> + numbered field. */
>>> +
>>> +static tree
>>> +get_field_from_struct (tree struct_type, int field_num)
>>> +{
>>> + int i;
>>> + tree field = TYPE_FIELDS (struct_type);
>>> +
>>> + for (i = 0; i < field_num; i++, field = DECL_CHAIN(field))
>>> + {
>>> + gcc_assert (field != NULL_TREE);
>>> + }
>>> +
>>> + return field;
>>> +}
>>> +
>>> +/* FN_CODE identifies a __builtin_cpu_* call, which is folded into a read of
>>> + a bit-field defined in libgcc/config/i386/i386-cpuinfo.c */
>>> +
>>> +static tree
>>> +fold_builtin_cpu (enum ix86_builtins fn_code)
>>> +{
>>> + /* This is the order of bit-fields in __processor_features in
>>> + i386-cpuinfo.c */
>>> + enum processor_features
>>> + {
>>> + F_CMOV = 0,
>>> + F_MMX,
>>> + F_POPCNT,
>>> + F_SSE,
>>> + F_SSE2,
>>> + F_SSE3,
>>> + F_SSSE3,
>>> + F_SSE4_1,
>>> + F_SSE4_2,
>>> + F_MAX
>>> + };
>>> +
>>> + /* This is the order of bit-fields in __processor_model in
>>> + i386-cpuinfo.c */
>>> + enum processor_model
>>> + {
>>> + M_AMD = 0,
>>> + M_INTEL,
>>> + M_INTEL_ATOM,
>>> + M_INTEL_CORE2,
>>> + M_INTEL_COREI7,
>>> + M_INTEL_COREI7_NEHALEM,
>>> + M_INTEL_COREI7_WESTMERE,
>>> + M_INTEL_COREI7_SANDYBRIDGE,
>>> + M_AMDFAM10H,
>>> + M_AMDFAM10H_BARCELONA,
>>> + M_AMDFAM10H_SHANGHAI,
>>> + M_AMDFAM10H_ISTANBUL,
>>> + M_AMDFAM15H_BDVER1,
>>> + M_AMDFAM15H_BDVER2,
>>> + M_MAX
>>> + };
>>> +
>>> + static tree __processor_features_type = NULL_TREE;
>>> + static tree __cpu_features_var = NULL_TREE;
>>> + static tree __processor_model_type = NULL_TREE;
>>> + static tree __cpu_model_var = NULL_TREE;
>>> + static tree field;
>>> + static tree which_struct;
>>> +
>>> + if (__processor_features_type == NULL_TREE)
>>> + __processor_features_type = build_struct_with_one_bit_fields (F_MAX,
>>> + "__processor_features");
>>> +
>>> + if (__processor_model_type == NULL_TREE)
>>> + __processor_model_type = build_struct_with_one_bit_fields (M_MAX,
>>> + "__processor_model");
>>> +
>>> + if (__cpu_features_var == NULL_TREE)
>>> + __cpu_features_var = make_var_decl (__processor_features_type,
>>> + "__cpu_features");
>>> +
>>> + if (__cpu_model_var == NULL_TREE)
>>> + __cpu_model_var = make_var_decl (__processor_model_type,
>>> + "__cpu_model");
>>> +
>>> + /* Look at the code to identify the field requested. */
>>> + switch (fn_code)
>>> + {
>>> + case IX86_BUILTIN_CPU_SUPPORTS_CMOV:
>>> + field = get_field_from_struct (__processor_features_type, F_CMOV);
>>> + which_struct = __cpu_features_var;
>>> + break;
>>> + case IX86_BUILTIN_CPU_SUPPORTS_MMX:
>>> + field = get_field_from_struct (__processor_features_type, F_MMX);
>>> + which_struct = __cpu_features_var;
>>> + break;
>>> + case IX86_BUILTIN_CPU_SUPPORTS_POPCOUNT:
>>> + field = get_field_from_struct (__processor_features_type, F_POPCNT);
>>> + which_struct = __cpu_features_var;
>>> + break;
>>> + case IX86_BUILTIN_CPU_SUPPORTS_SSE:
>>> + field = get_field_from_struct (__processor_features_type, F_SSE);
>>> + which_struct = __cpu_features_var;
>>> + break;
>>> + case IX86_BUILTIN_CPU_SUPPORTS_SSE2:
>>> + field = get_field_from_struct (__processor_features_type, F_SSE2);
>>> + which_struct = __cpu_features_var;
>>> + break;
>>> + case IX86_BUILTIN_CPU_SUPPORTS_SSE3:
>>> + field = get_field_from_struct (__processor_features_type, F_SSE3);
>>> + which_struct = __cpu_features_var;
>>> + break;
>>> + case IX86_BUILTIN_CPU_SUPPORTS_SSSE3:
>>> + field = get_field_from_struct (__processor_features_type, F_SSSE3);
>>> + which_struct = __cpu_features_var;
>>> + break;
>>> + case IX86_BUILTIN_CPU_SUPPORTS_SSE4_1:
>>> + field = get_field_from_struct (__processor_features_type, F_SSE4_1);
>>> + which_struct = __cpu_features_var;
>>> + break;
>>> + case IX86_BUILTIN_CPU_SUPPORTS_SSE4_2:
>>> + field = get_field_from_struct (__processor_features_type, F_SSE4_2);
>>> + which_struct = __cpu_features_var;
>>> + break;
>>> + case IX86_BUILTIN_CPU_IS_AMD:
>>> + field = get_field_from_struct (__processor_model_type, M_AMD);
>>> + which_struct = __cpu_model_var;
>>> + break;
>>> + case IX86_BUILTIN_CPU_IS_INTEL:
>>> + field = get_field_from_struct (__processor_model_type, M_INTEL);
>>> + which_struct = __cpu_model_var;
>>> + break;
>>> + case IX86_BUILTIN_CPU_IS_INTEL_ATOM:
>>> + field = get_field_from_struct (__processor_model_type, M_INTEL_ATOM);
>>> + which_struct = __cpu_model_var;
>>> + break;
>>> + case IX86_BUILTIN_CPU_IS_INTEL_CORE2:
>>> + field = get_field_from_struct (__processor_model_type, M_INTEL_CORE2);
>>> + which_struct = __cpu_model_var;
>>> + break;
>>> + case IX86_BUILTIN_CPU_IS_INTEL_COREI7:
>>> + field = get_field_from_struct (__processor_model_type,
>>> + M_INTEL_COREI7);
>>> + which_struct = __cpu_model_var;
>>> + break;
>>> + case IX86_BUILTIN_CPU_IS_INTEL_COREI7_NEHALEM:
>>> + field = get_field_from_struct (__processor_model_type,
>>> + M_INTEL_COREI7_NEHALEM);
>>> + which_struct = __cpu_model_var;
>>> + break;
>>> + case IX86_BUILTIN_CPU_IS_INTEL_COREI7_WESTMERE:
>>> + field = get_field_from_struct (__processor_model_type,
>>> + M_INTEL_COREI7_WESTMERE);
>>> + which_struct = __cpu_model_var;
>>> + break;
>>> + case IX86_BUILTIN_CPU_IS_INTEL_COREI7_SANDYBRIDGE:
>>> + field = get_field_from_struct (__processor_model_type,
>>> + M_INTEL_COREI7_SANDYBRIDGE);
>>> + which_struct = __cpu_model_var;
>>> + break;
>>> + case IX86_BUILTIN_CPU_IS_AMDFAM10H:
>>> + field = get_field_from_struct (__processor_model_type,
>>> + M_AMDFAM10H);
>>> + which_struct = __cpu_model_var;
>>> + break;
>>> + case IX86_BUILTIN_CPU_IS_AMDFAM10H_BARCELONA:
>>> + field = get_field_from_struct (__processor_model_type,
>>> + M_AMDFAM10H_BARCELONA);
>>> + which_struct = __cpu_model_var;
>>> + break;
>>> + case IX86_BUILTIN_CPU_IS_AMDFAM10H_SHANGHAI:
>>> + field = get_field_from_struct (__processor_model_type,
>>> + M_AMDFAM10H_SHANGHAI);
>>> + which_struct = __cpu_model_var;
>>> + break;
>>> + case IX86_BUILTIN_CPU_IS_AMDFAM10H_ISTANBUL:
>>> + field = get_field_from_struct (__processor_model_type,
>>> + M_AMDFAM10H_ISTANBUL);
>>> + which_struct = __cpu_model_var;
>>> + break;
>>> + case IX86_BUILTIN_CPU_IS_AMDFAM15H_BDVER1:
>>> + field = get_field_from_struct (__processor_model_type,
>>> + M_AMDFAM15H_BDVER1);
>>> + which_struct = __cpu_model_var;
>>> + break;
>>> + case IX86_BUILTIN_CPU_IS_AMDFAM15H_BDVER2:
>>> + field = get_field_from_struct (__processor_model_type,
>>> + M_AMDFAM15H_BDVER2);
>>> + which_struct = __cpu_model_var;
>>> + break;
>>> + default:
>>> + return NULL_TREE;
>>> + }
>>> +
>>> + return build3 (COMPONENT_REF, TREE_TYPE (field), which_struct, field, NULL_TREE);
>>> +}
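The switch above maps each builtin code to a (struct, field index) pair, and that mapping could also be expressed as a table. A rough standalone sketch follows, using hypothetical names (builtin_code, field_ref, lookup_field) and only four entries. The indices mirror F_CMOV, F_SSE2, M_AMD and M_INTEL from the enums in the patch; nothing here uses GCC's tree API.

```c
#include <assert.h>
#include <stddef.h>

enum which_struct { IN_FEATURES, IN_MODEL };

/* Hypothetical stand-ins for a few IX86_BUILTIN_CPU_* codes.  */
enum builtin_code
{
  CODE_SUPPORTS_CMOV,
  CODE_SUPPORTS_SSE2,
  CODE_IS_AMD,
  CODE_IS_INTEL,
  CODE_MAX
};

struct field_ref
{
  enum which_struct where; /* Which variable holds the bit.  */
  int index;               /* Position of the bit-field in the struct.  */
};

/* The index must match the bit-field order used on the libgcc side.  */
static const struct field_ref field_table[CODE_MAX] =
{
  [CODE_SUPPORTS_CMOV] = { IN_FEATURES, 0 },
  [CODE_SUPPORTS_SSE2] = { IN_FEATURES, 4 },
  [CODE_IS_AMD]        = { IN_MODEL, 0 },
  [CODE_IS_INTEL]      = { IN_MODEL, 1 },
};

/* Returns NULL for codes that are not CPU builtins.  */
const struct field_ref *lookup_field (int code)
{
  if (code < 0 || code >= CODE_MAX)
    return NULL;
  return &field_table[code];
}
```

Either form depends on the enum order staying in sync with the struct layout in i386-cpuinfo.c, so that coupling is worth a comment on both sides.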
>>> +
>>> +static tree
>>> +ix86_fold_builtin (tree fndecl, int n_args ATTRIBUTE_UNUSED,
>>> + tree *args ATTRIBUTE_UNUSED, bool ignore ATTRIBUTE_UNUSED)
>>> +{
>>> + const char* decl_name = IDENTIFIER_POINTER (DECL_NAME (fndecl));
>>> + if (DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_MD
>>> + && strstr(decl_name, "__builtin_cpu") != NULL)
>>> + {
>>> + enum ix86_builtins code = (enum ix86_builtins)
>>> + DECL_FUNCTION_CODE (fndecl);
>>> + return fold_builtin_cpu (code);
>>> + }
>>> + return NULL_TREE;
>>> +}
>>> +
>>> +/* A builtin to init/return the cpu type or feature. Returns an
>>> + integer and the type is a const if IS_CONST is set. */
>>> +
>>> +static void
>>> +make_platform_builtin (const char* name, int code, int is_const)
>>> +{
>>> + tree decl;
>>> + tree type;
>>> +
>>> + type = ix86_get_builtin_func_type (INT_FTYPE_VOID);
>>> + decl = add_builtin_function (name, type, code, BUILT_IN_MD,
>>> + NULL, NULL_TREE);
>>> + gcc_assert (decl != NULL_TREE);
>>> + ix86_builtins[(int) code] = decl;
>>> + if (is_const)
>>> + TREE_READONLY (decl) = 1;
>>> +}
>>> +
>>> +/* Builtins to get CPU type and features supported. */
>>> +
>>> +static void
>>> +ix86_init_platform_type_builtins (void)
>>> +{
>>> + make_platform_builtin ("__builtin_cpu_init",
>>> + IX86_BUILTIN_CPU_INIT, 0);
>>> + make_platform_builtin ("__builtin_cpu_supports_cmov",
>>> + IX86_BUILTIN_CPU_SUPPORTS_CMOV, 1);
>>> + make_platform_builtin ("__builtin_cpu_supports_mmx",
>>> + IX86_BUILTIN_CPU_SUPPORTS_MMX, 1);
>>> + make_platform_builtin ("__builtin_cpu_supports_popcount",
>>> + IX86_BUILTIN_CPU_SUPPORTS_POPCOUNT, 1);
>>> + make_platform_builtin ("__builtin_cpu_supports_sse",
>>> + IX86_BUILTIN_CPU_SUPPORTS_SSE, 1);
>>> + make_platform_builtin ("__builtin_cpu_supports_sse2",
>>> + IX86_BUILTIN_CPU_SUPPORTS_SSE2, 1);
>>> + make_platform_builtin ("__builtin_cpu_supports_sse3",
>>> + IX86_BUILTIN_CPU_SUPPORTS_SSE3, 1);
>>> + make_platform_builtin ("__builtin_cpu_supports_ssse3",
>>> + IX86_BUILTIN_CPU_SUPPORTS_SSSE3, 1);
>>> + make_platform_builtin ("__builtin_cpu_supports_sse4_1",
>>> + IX86_BUILTIN_CPU_SUPPORTS_SSE4_1, 1);
>>> + make_platform_builtin ("__builtin_cpu_supports_sse4_2",
>>> + IX86_BUILTIN_CPU_SUPPORTS_SSE4_2, 1);
>>> + make_platform_builtin ("__builtin_cpu_is_amd",
>>> + IX86_BUILTIN_CPU_IS_AMD, 1);
>>> + make_platform_builtin ("__builtin_cpu_is_intel_atom",
>>> + IX86_BUILTIN_CPU_IS_INTEL_ATOM, 1);
>>> + make_platform_builtin ("__builtin_cpu_is_intel_core2",
>>> + IX86_BUILTIN_CPU_IS_INTEL_CORE2, 1);
>>> + make_platform_builtin ("__builtin_cpu_is_intel",
>>> + IX86_BUILTIN_CPU_IS_INTEL, 1);
>>> + make_platform_builtin ("__builtin_cpu_is_intel_corei7",
>>> + IX86_BUILTIN_CPU_IS_INTEL_COREI7, 1);
>>> + make_platform_builtin ("__builtin_cpu_is_intel_corei7_nehalem",
>>> + IX86_BUILTIN_CPU_IS_INTEL_COREI7_NEHALEM, 1);
>>> + make_platform_builtin ("__builtin_cpu_is_intel_corei7_westmere",
>>> + IX86_BUILTIN_CPU_IS_INTEL_COREI7_WESTMERE, 1);
>>> + make_platform_builtin ("__builtin_cpu_is_intel_corei7_sandybridge",
>>> + IX86_BUILTIN_CPU_IS_INTEL_COREI7_SANDYBRIDGE, 1);
>>> + make_platform_builtin ("__builtin_cpu_is_amdfam10",
>>> + IX86_BUILTIN_CPU_IS_AMDFAM10H, 1);
>>> + make_platform_builtin ("__builtin_cpu_is_amdfam10_barcelona",
>>> + IX86_BUILTIN_CPU_IS_AMDFAM10H_BARCELONA, 1);
>>> + make_platform_builtin ("__builtin_cpu_is_amdfam10_shanghai",
>>> + IX86_BUILTIN_CPU_IS_AMDFAM10H_SHANGHAI, 1);
>>> + make_platform_builtin ("__builtin_cpu_is_amdfam10_istanbul",
>>> + IX86_BUILTIN_CPU_IS_AMDFAM10H_ISTANBUL, 1);
>>> + make_platform_builtin ("__builtin_cpu_is_amdfam15_bdver1",
>>> + IX86_BUILTIN_CPU_IS_AMDFAM15H_BDVER1, 1);
>>> + make_platform_builtin ("__builtin_cpu_is_amdfam15_bdver2",
>>> + IX86_BUILTIN_CPU_IS_AMDFAM15H_BDVER2, 1);
>>> +}
>>> +
>>> /* Internal method for ix86_init_builtins. */
>>>
>>> static void
>>> @@ -27529,6 +28143,9 @@ ix86_init_builtins (void)
>>>
>>> ix86_init_builtin_types ();
>>>
>>> + /* Builtins to get CPU type and features. */
>>> + ix86_init_platform_type_builtins ();
>>> +
>>> /* TFmode support builtins. */
>>> def_builtin_const (0, "__builtin_infq",
>>> FLOAT128_FTYPE_VOID, IX86_BUILTIN_INFQ);
>>> @@ -29145,6 +29762,48 @@ ix86_expand_builtin (tree exp, rtx target, rtx sub
>>> enum machine_mode mode0, mode1, mode2, mode3, mode4;
>>> unsigned int fcode = DECL_FUNCTION_CODE (fndecl);
>>>
>>> + /* For CPU builtins that can be folded, fold first and expand the fold. */
>>> + switch (fcode)
>>> + {
>>> + case IX86_BUILTIN_CPU_SUPPORTS_CMOV:
>>> + case IX86_BUILTIN_CPU_SUPPORTS_MMX:
>>> + case IX86_BUILTIN_CPU_SUPPORTS_POPCOUNT:
>>> + case IX86_BUILTIN_CPU_SUPPORTS_SSE:
>>> + case IX86_BUILTIN_CPU_SUPPORTS_SSE2:
>>> + case IX86_BUILTIN_CPU_SUPPORTS_SSE3:
>>> + case IX86_BUILTIN_CPU_SUPPORTS_SSSE3:
>>> + case IX86_BUILTIN_CPU_SUPPORTS_SSE4_1:
>>> + case IX86_BUILTIN_CPU_SUPPORTS_SSE4_2:
>>> + case IX86_BUILTIN_CPU_IS_AMD:
>>> + case IX86_BUILTIN_CPU_IS_INTEL:
>>> + case IX86_BUILTIN_CPU_IS_INTEL_ATOM:
>>> + case IX86_BUILTIN_CPU_IS_INTEL_CORE2:
>>> + case IX86_BUILTIN_CPU_IS_INTEL_COREI7:
>>> + case IX86_BUILTIN_CPU_IS_INTEL_COREI7_NEHALEM:
>>> + case IX86_BUILTIN_CPU_IS_INTEL_COREI7_WESTMERE:
>>> + case IX86_BUILTIN_CPU_IS_INTEL_COREI7_SANDYBRIDGE:
>>> + case IX86_BUILTIN_CPU_IS_AMDFAM10H:
>>> + case IX86_BUILTIN_CPU_IS_AMDFAM10H_BARCELONA:
>>> + case IX86_BUILTIN_CPU_IS_AMDFAM10H_SHANGHAI:
>>> + case IX86_BUILTIN_CPU_IS_AMDFAM10H_ISTANBUL:
>>> + case IX86_BUILTIN_CPU_IS_AMDFAM15H_BDVER1:
>>> + case IX86_BUILTIN_CPU_IS_AMDFAM15H_BDVER2:
>>> + {
>>> + tree fold_expr = fold_builtin_cpu ((enum ix86_builtins) fcode);
>>> + gcc_assert (fold_expr != NULL_TREE);
>>> + return expand_expr (fold_expr, target, mode, EXPAND_NORMAL);
>>> + }
>>> + case IX86_BUILTIN_CPU_INIT:
>>> + {
>>> + /* Make it call __cpu_indicator_init in libgcc. */
>>> + tree call_expr, fndecl, type;
>>> + type = build_function_type_list (integer_type_node, NULL_TREE);
>>> + fndecl = build_fn_decl ("__cpu_indicator_init", type);
>>> + call_expr = build_call_expr (fndecl, 0);
>>> + return expand_expr (call_expr, target, mode, EXPAND_NORMAL);
>>> + }
>>> + }
>>> +
>>> /* Determine whether the builtin function is available under the current ISA.
>>> Originally the builtin was not created if it wasn't applicable to the
>>> current ISA based on the command line switches. With function specific
>>> @@ -38610,6 +39269,12 @@ ix86_autovectorize_vector_sizes (void)
>>> #undef TARGET_BUILD_BUILTIN_VA_LIST
>>> #define TARGET_BUILD_BUILTIN_VA_LIST ix86_build_builtin_va_list
>>>
>>> +#undef TARGET_FOLD_BUILTIN
>>> +#define TARGET_FOLD_BUILTIN ix86_fold_builtin
>>> +
>>> #undef TARGET_ENUM_VA_LIST_P
>>> #define TARGET_ENUM_VA_LIST_P ix86_enum_va_list
>>> Index: gcc/testsuite/gcc.target/i386/builtin_target.c
>>> ===================================================================
>>> --- gcc/testsuite/gcc.target/i386/builtin_target.c (revision 0)
>>> +++ gcc/testsuite/gcc.target/i386/builtin_target.c (revision 0)
>>> @@ -0,0 +1,61 @@
>>> +/* This test checks if the __builtin_cpu_* calls are recognized. */
>>> +
>>> +/* { dg-do run } */
>>> +
>>> +int
>>> +fn1 ()
>>> +{
>>> + if (__builtin_cpu_supports_cmov () < 0)
>>> + return -1;
>>> + if (__builtin_cpu_supports_mmx () < 0)
>>> + return -1;
>>> + if (__builtin_cpu_supports_popcount () < 0)
>>> + return -1;
>>> + if (__builtin_cpu_supports_sse () < 0)
>>> + return -1;
>>> + if (__builtin_cpu_supports_sse2 () < 0)
>>> + return -1;
>>> + if (__builtin_cpu_supports_sse3 () < 0)
>>> + return -1;
>>> + if (__builtin_cpu_supports_ssse3 () < 0)
>>> + return -1;
>>> + if (__builtin_cpu_supports_sse4_1 () < 0)
>>> + return -1;
>>> + if (__builtin_cpu_supports_sse4_2 () < 0)
>>> + return -1;
>>> + if (__builtin_cpu_is_amd () < 0)
>>> + return -1;
>>> + if (__builtin_cpu_is_intel () < 0)
>>> + return -1;
>>> + if (__builtin_cpu_is_intel_atom () < 0)
>>> + return -1;
>>> + if (__builtin_cpu_is_intel_core2 () < 0)
>>> + return -1;
>>> + if (__builtin_cpu_is_intel_corei7 () < 0)
>>> + return -1;
>>> + if (__builtin_cpu_is_intel_corei7_nehalem () < 0)
>>> + return -1;
>>> + if (__builtin_cpu_is_intel_corei7_westmere () < 0)
>>> + return -1;
>>> + if (__builtin_cpu_is_intel_corei7_sandybridge () < 0)
>>> + return -1;
>>> + if (__builtin_cpu_is_amdfam10 () < 0)
>>> + return -1;
>>> + if (__builtin_cpu_is_amdfam10_barcelona () < 0)
>>> + return -1;
>>> + if (__builtin_cpu_is_amdfam10_shanghai () < 0)
>>> + return -1;
>>> + if (__builtin_cpu_is_amdfam10_istanbul () < 0)
>>> + return -1;
>>> + if (__builtin_cpu_is_amdfam15_bdver1 () < 0)
>>> + return -1;
>>> + if (__builtin_cpu_is_amdfam15_bdver2 () < 0)
>>> + return -1;
>>> +
>>> + return 0;
>>> +}
>>> +
>>> +int main ()
>>> +{
>>> + return fn1 ();
>>> +}
>>>
>>> --
>>> This patch is available for review at http://codereview.appspot.com/5754058
Message from unknown
2012-03-30T00:09:35+00:00Sriramanurn:md5:bfd83aeea9ae4cc62adbfaece585b667
Message from tmsriram@google.com
2012-03-30T00:13:13+00:00Sriramanurn:md5:c33e8c0697a2f07f0aa339aee5187741
Subject:Support for Runtime CPU type detection via builtins
Hi,
I have uploaded a new patch that has only two builtins:
* __builtin_cpu_is ("<CPUNAME>")
* __builtin_cpu_supports ("<FEATURE>")
apart from the cpu init builtin, __builtin_cpu_init.
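The init builtin stays because, as described in the original submission quoted below, the libgcc constructor runs at priority 101 and IFUNC resolvers can still run before it. A minimal sketch of the constructor ordering only, using hypothetical functions early_init and default_init and a hypothetical log array (none of these are in the patch):

```c
/* Hypothetical log of the order in which the constructors ran.  */
static int ctor_order[2];
static int ctor_count;

/* Priority 101: runs before constructors with no explicit priority.  */
__attribute__ ((constructor (101)))
static void
early_init (void)
{
  ctor_order[ctor_count++] = 101;
}

/* No explicit priority: runs after the prioritized constructor above.  */
__attribute__ ((constructor))
static void
default_init (void)
{
  ctor_order[ctor_count++] = 0;
}
```

By the time main starts, ctor_order should hold 101 followed by 0. An IFUNC resolver may run before either constructor, which is why such a resolver would call __builtin_cpu_init () itself before testing anything.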
List of CPU names:
* "amd"
* "intel"
* "atom"
* "core2"
* "corei7"
* "nehalem"
* "westmere"
* "sandybridge"
* "amdfam10h"
* "barcelona"
* "shanghai"
* "istanbul"
* "bdver1"
* "bdver2"
List of CPU features:
* "cmov"
* "mmx"
* "popcnt"
* "sse"
* "sse2"
* "sse3"
* "ssse3"
* "sse4.1"
* "sse4.2"
As an example, to check whether the CPU is a corei7, call __builtin_cpu_is ("corei7").
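A short sketch of how calling code might use the two builtins, assuming the interface proposed in this message. The function names use_hw_popcount and is_corei7 are hypothetical, and the x86 preprocessor guard is an assumption added so the file still compiles on other targets:

```c
/* Hypothetical helper: returns 1 when a popcnt-based variant could be
   chosen, 0 when the generic fallback should be used.  */
int
use_hw_popcount (void)
{
#if defined (__i386__) || defined (__x86_64__)
  /* Harmless if the libgcc constructor has already run.  */
  __builtin_cpu_init ();
  return __builtin_cpu_supports ("popcnt") ? 1 : 0;
#else
  return 0;
#endif
}

/* Hypothetical helper: returns 1 only on a CPU reported as corei7.  */
int
is_corei7 (void)
{
#if defined (__i386__) || defined (__x86_64__)
  __builtin_cpu_init ();
  return __builtin_cpu_is ("corei7") ? 1 : 0;
#else
  return 0;
#endif
}
```

Both builtins take a string literal, so the feature or CPU name has to be known at compile time. The result depends on the machine that runs the code.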
Comments?
Thanks.
On Mon, Mar 12, 2012 at 4:16 AM, Richard Guenther
<richard.guenther@gmail.com> wrote:
> On Thu, Mar 8, 2012 at 9:35 PM, Xinliang David Li <davidxl@google.com> wrote:
>> On Wed, Mar 7, 2012 at 5:51 AM, Richard Guenther
>> <richard.guenther@gmail.com> wrote:
>>> On Wed, Mar 7, 2012 at 1:49 AM, Sriraman Tallam <tmsriram@google.com> wrote:
>>>> Patch for CPU detection at run-time.
>>>> ===================================
>>>>
>>>> Patch for CPU detection at run-time, to be used in dispatching of
>>>> multi-versioned functions. Please see this discussion:
>>>> http://gcc.gnu.org/ml/gcc-patches/2011-08/msg01355.html
>>>> when this patch was reviewed the last time.
>>>>
>>>> For more detailed description:
>>>> http://gcc.gnu.org/ml/gcc/2012-03/msg00074.html
>>>>
>>>> One of the main concerns was about making CPU detection initialization a
>>>> constructor. The main point raised was about constructor ordering. I have
>>>> added a priority value to the CPU detection constructor to make it very high
>>>> priority so that it is guaranteed to fire before every constructor without
>>>> an explicitly marked priority value of 101. However, IFUNC initializers
>>>> will still fire before this constructor, so the cpu initialization routine
>>>> has to be explicitly called in such initializers for which I have added a
>>>> builtin: __builtin_cpu_init ().
>>>>
>>>> This patch adds the following new builtins:
>>>>
>>>> * __builtin_cpu_init
>>>> * __builtin_cpu_supports_cmov
>>>> * __builtin_cpu_supports_mmx
>>>> * __builtin_cpu_supports_popcount
>>>> * __builtin_cpu_supports_sse
>>>> * __builtin_cpu_supports_sse2
>>>> * __builtin_cpu_supports_sse3
>>>> * __builtin_cpu_supports_ssse3
>>>> * __builtin_cpu_supports_sse4_1
>>>> * __builtin_cpu_supports_sse4_2
>>>> * __builtin_cpu_is_amd
>>>> * __builtin_cpu_is_intel_atom
>>>> * __builtin_cpu_is_intel_core2
>>>> * __builtin_cpu_is_intel
>>>> * __builtin_cpu_is_intel_corei7
>>>> * __builtin_cpu_is_intel_corei7_nehalem
>>>> * __builtin_cpu_is_intel_corei7_westmere
>>>> * __builtin_cpu_is_intel_corei7_sandybridge
>>>> * __builtin_cpu_is_amdfam10
>>>> * __builtin_cpu_is_amdfam10_barcelona
>>>> * __builtin_cpu_is_amdfam10_shanghai
>>>> * __builtin_cpu_is_amdfam10_istanbul
>>>> * __builtin_cpu_is_amdfam15_bdver1
>>>> * __builtin_cpu_is_amdfam15_bdver2
>>>
>>> I think the non-feature detection functions are not necessary at all.
>>
>> They are useful if compiler needs to do auto versioning based on cpu model.
>>
>>> Builtin functions are not exactly cheap, nor is the scheme you invent
>>> backward/forward compatible. Instead, why not add a single builtin
>>> function, __builtin_cpu_supports(const char *), and decode from
>>> a comma-separated list of features? Unknown features are simply
>>> "not present". So I can write code with only a single configure check,
>>
>> This is a good idea.
>>
>> __builtin_is_cpu (const char* );
>> __builtin_cpu_supports (const char*);
>
> That looks good to me.
>
> Richard.
>
>> thanks,
>>
>> David
>>
>>
>>> for __builtin_cpu_supports, and cater for future features or older compilers.
>>>
>>> And of course that builtin would be even cross-platform.
>>>
>>> Implementation-wise I'll leave this to x86 maintainers to comment on.
>>>
>>> Richard.
>>>
>>>>
>>>> * config/i386/i386.c (build_struct_with_one_bit_fields): New function.
>>>> (make_var_decl): New function.
>>>> (get_field_from_struct): New function.
>>>> (fold_builtin_target): New function.
>>>> (ix86_fold_builtin): New function.
>>>> (ix86_expand_builtin): Expand new builtins by folding them.
>>>> (make_platform_builtin): New function.
>>>> (ix86_init_platform_type_builtins): Make the new builtins.
>>>> (ix86_init_builtins): Make new builtins to detect CPU type.
>>>> (TARGET_FOLD_BUILTIN): New macro.
>>>> (IX86_BUILTIN_CPU_SUPPORTS_CMOV): New enum value.
>>>> (IX86_BUILTIN_CPU_SUPPORTS_MMX): New enum value.
>>>> (IX86_BUILTIN_CPU_SUPPORTS_POPCOUNT): New enum value.
>>>> (IX86_BUILTIN_CPU_SUPPORTS_SSE): New enum value.
>>>> (IX86_BUILTIN_CPU_SUPPORTS_SSE2): New enum value.
>>>> (IX86_BUILTIN_CPU_SUPPORTS_SSE3): New enum value.
>>>> (IX86_BUILTIN_CPU_SUPPORTS_SSSE3): New enum value.
>>>> (IX86_BUILTIN_CPU_SUPPORTS_SSE4_1): New enum value.
>>>> (IX86_BUILTIN_CPU_SUPPORTS_SSE4_2): New enum value.
>>>> (IX86_BUILTIN_CPU_INIT): New enum value.
>>>> (IX86_BUILTIN_CPU_IS_AMD): New enum value.
>>>> (IX86_BUILTIN_CPU_IS_INTEL): New enum value.
>>>> (IX86_BUILTIN_CPU_IS_INTEL_ATOM): New enum value.
>>>> (IX86_BUILTIN_CPU_IS_INTEL_CORE2): New enum value.
>>>> (IX86_BUILTIN_CPU_IS_INTEL_COREI7_NEHALEM): New enum value.
>>>> (IX86_BUILTIN_CPU_IS_INTEL_COREI7_WESTMERE): New enum value.
>>>> (IX86_BUILTIN_CPU_IS_INTEL_COREI7_SANDYBRIDGE): New enum value.
>>>> (IX86_BUILTIN_CPU_IS_AMDFAM10_BARCELONA): New enum value.
>>>> (IX86_BUILTIN_CPU_IS_AMDFAM10_SHANGHAI): New enum value.
>>>> (IX86_BUILTIN_CPU_IS_AMDFAM10_ISTANBUL): New enum value.
>>>> (IX86_BUILTIN_CPU_IS_AMDFAM15H_BDVER1): New enum value.
>>>> (IX86_BUILTIN_CPU_IS_AMDFAM15H_BDVER2): New enum value.
>>>> * config/i386/i386-builtin-types.def: New function type.
>>>> * testsuite/gcc.target/builtin_target.c: New testcase.
>>>>
>>>> * libgcc/config/i386/i386-cpuinfo.c: New file.
>>>> * libgcc/config/i386/t-cpuinfo: New file.
>>>> * libgcc/config.host: Include t-cpuinfo.
>>>> * libgcc/config/i386/libgcc-glibc.ver: Version symbols __cpu_model
>>>> and __cpu_features.
>>>>
>>>> Index: libgcc/config.host
>>>> ===================================================================
>>>> --- libgcc/config.host (revision 184971)
>>>> +++ libgcc/config.host (working copy)
>>>> @@ -1142,7 +1142,7 @@ i[34567]86-*-linux* | x86_64-*-linux* | \
>>>> i[34567]86-*-kfreebsd*-gnu | x86_64-*-kfreebsd*-gnu | \
>>>> i[34567]86-*-knetbsd*-gnu | \
>>>> i[34567]86-*-gnu*)
>>>> - tmake_file="${tmake_file} t-tls i386/t-linux"
>>>> + tmake_file="${tmake_file} t-tls i386/t-linux i386/t-cpuinfo"
>>>> if test "$libgcc_cv_cfi" = "yes"; then
>>>> tmake_file="${tmake_file} t-stack i386/t-stack-i386"
>>>> fi
>>>> Index: libgcc/config/i386/t-cpuinfo
>>>> ===================================================================
>>>> --- libgcc/config/i386/t-cpuinfo (revision 0)
>>>> +++ libgcc/config/i386/t-cpuinfo (revision 0)
>>>> @@ -0,0 +1 @@
>>>> +LIB2ADD += $(srcdir)/config/i386/i386-cpuinfo.c
>>>> Index: libgcc/config/i386/i386-cpuinfo.c
>>>> ===================================================================
>>>> --- libgcc/config/i386/i386-cpuinfo.c (revision 0)
>>>> +++ libgcc/config/i386/i386-cpuinfo.c (revision 0)
>>>> @@ -0,0 +1,306 @@
>>>> +/* Get CPU type and Features for x86 processors.
>>>> + Copyright (C) 2011 Free Software Foundation, Inc.
>>>> + Contributed by Sriraman Tallam (tmsriram@google.com)
>>>> +
>>>> +This file is part of GCC.
>>>> +
>>>> +GCC is free software; you can redistribute it and/or modify it under
>>>> +the terms of the GNU General Public License as published by the Free
>>>> +Software Foundation; either version 3, or (at your option) any later
>>>> +version.
>>>> +
>>>> +GCC is distributed in the hope that it will be useful, but WITHOUT ANY
>>>> +WARRANTY; without even the implied warranty of MERCHANTABILITY or
>>>> +FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
>>>> +for more details.
>>>> +
>>>> +You should have received a copy of the GNU General Public License
>>>> +along with GCC; see the file COPYING3. If not see
>>>> +<http://www.gnu.org/licenses/>. */
>>>> +
>>>> +#include "cpuid.h"
>>>> +#include "tsystem.h"
>>>> +
>>>> +int __cpu_indicator_init (void) __attribute__ ((constructor (101)));
>>>> +
>>>> +enum vendor_signatures
>>>> +{
>>>> + SIG_INTEL = 0x756e6547 /* Genu */,
>>>> + SIG_AMD = 0x68747541 /* Auth */
>>>> +};
>>>> +
>>>> +/* ISA Features supported. */
>>>> +
>>>> +struct __processor_features
>>>> +{
>>>> + unsigned int __cpu_cmov : 1;
>>>> + unsigned int __cpu_mmx : 1;
>>>> + unsigned int __cpu_popcnt : 1;
>>>> + unsigned int __cpu_sse : 1;
>>>> + unsigned int __cpu_sse2 : 1;
>>>> + unsigned int __cpu_sse3 : 1;
>>>> + unsigned int __cpu_ssse3 : 1;
>>>> + unsigned int __cpu_sse4_1 : 1;
>>>> + unsigned int __cpu_sse4_2 : 1;
>>>> +} __cpu_features;
>>>> +
>>>> +/* Processor Model. */
>>>> +
>>>> +struct __processor_model
>>>> +{
>>>> + /* Vendor. */
>>>> + unsigned int __cpu_is_amd : 1;
>>>> + unsigned int __cpu_is_intel : 1;
>>>> + /* CPU type. */
>>>> + unsigned int __cpu_is_intel_atom : 1;
>>>> + unsigned int __cpu_is_intel_core2 : 1;
>>>> + unsigned int __cpu_is_intel_corei7 : 1;
>>>> + unsigned int __cpu_is_intel_corei7_nehalem : 1;
>>>> + unsigned int __cpu_is_intel_corei7_westmere : 1;
>>>> + unsigned int __cpu_is_intel_corei7_sandybridge : 1;
>>>> + unsigned int __cpu_is_amdfam10h : 1;
>>>> + unsigned int __cpu_is_amdfam10h_barcelona : 1;
>>>> + unsigned int __cpu_is_amdfam10h_shanghai : 1;
>>>> + unsigned int __cpu_is_amdfam10h_istanbul : 1;
>>>> + unsigned int __cpu_is_amdfam15h_bdver1 : 1;
>>>> + unsigned int __cpu_is_amdfam15h_bdver2 : 1;
>>>> +} __cpu_model;
>>>> +
>>>> +/* Get the specific type of AMD CPU. */
>>>> +
>>>> +static void
>>>> +get_amd_cpu (unsigned int family, unsigned int model)
>>>> +{
>>>> + switch (family)
>>>> + {
>>>> + /* AMD Family 10h. */
>>>> + case 0x10:
>>>> + switch (model)
>>>> + {
>>>> + case 0x2:
>>>> + /* Barcelona. */
>>>> + __cpu_model.__cpu_is_amdfam10h = 1;
>>>> + __cpu_model.__cpu_is_amdfam10h_barcelona = 1;
>>>> + break;
>>>> + case 0x4:
>>>> + /* Shanghai. */
>>>> + __cpu_model.__cpu_is_amdfam10h = 1;
>>>> + __cpu_model.__cpu_is_amdfam10h_shanghai = 1;
>>>> + break;
>>>> + case 0x8:
>>>> + /* Istanbul. */
>>>> + __cpu_model.__cpu_is_amdfam10h = 1;
>>>> + __cpu_model.__cpu_is_amdfam10h_istanbul = 1;
>>>> + break;
>>>> + default:
>>>> + break;
>>>> + }
>>>> + break;
>>>> + /* AMD Family 15h. */
>>>> + case 0x15:
>>>> + /* Bulldozer version 1. */
>>>> + if (model >= 0 && model <= 0xf)
>>>> + __cpu_model.__cpu_is_amdfam15h_bdver1 = 1;
>>>> + /* Bulldozer version 2. */
>>>> + if (model >= 0x10 && model <= 0x1f)
>>>> + __cpu_model.__cpu_is_amdfam15h_bdver2 = 1;
>>>> + break;
>>>> + default:
>>>> + break;
>>>> + }
>>>> +}
>>>> +
>>>> +/* Get the specific type of Intel CPU. */
>>>> +
>>>> +static void
>>>> +get_intel_cpu (unsigned int family, unsigned int model, unsigned int brand_id)
>>>> +{
>>>> + /* Parse family and model only if brand ID is 0. */
>>>> + if (brand_id == 0)
>>>> + {
>>>> + switch (family)
>>>> + {
>>>> + case 0x5:
>>>> + /* Pentium. */
>>>> + break;
>>>> + case 0x6:
>>>> + switch (model)
>>>> + {
>>>> + case 0x1c:
>>>> + case 0x26:
>>>> + /* Atom. */
>>>> + __cpu_model.__cpu_is_intel_atom = 1;
>>>> + break;
>>>> + case 0x1a:
>>>> + case 0x1e:
>>>> + case 0x1f:
>>>> + case 0x2e:
>>>> + /* Nehalem. */
>>>> + __cpu_model.__cpu_is_intel_corei7 = 1;
>>>> + __cpu_model.__cpu_is_intel_corei7_nehalem = 1;
>>>> + break;
>>>> + case 0x25:
>>>> + case 0x2c:
>>>> + case 0x2f:
>>>> + /* Westmere. */
>>>> + __cpu_model.__cpu_is_intel_corei7 = 1;
>>>> + __cpu_model.__cpu_is_intel_corei7_westmere = 1;
>>>> + break;
>>>> + case 0x2a:
>>>> + /* Sandy Bridge. */
>>>> + __cpu_model.__cpu_is_intel_corei7 = 1;
>>>> + __cpu_model.__cpu_is_intel_corei7_sandybridge = 1;
>>>> + break;
>>>> + case 0x17:
>>>> + case 0x1d:
>>>> + /* Penryn. */
>>>> + case 0x0f:
>>>> + /* Merom. */
>>>> + __cpu_model.__cpu_is_intel_core2 = 1;
>>>> + break;
>>>> + default:
>>>> + break;
>>>> + }
>>>> + break;
>>>> + default:
>>>> + /* We have no idea. */
>>>> + break;
>>>> + }
>>>> + }
>>>> +}
>>>> +
>>>> +static void
>>>> +get_available_features (unsigned int ecx, unsigned int edx)
>>>> +{
>>>> + __cpu_features.__cpu_cmov = (edx & bit_CMOV) ? 1 : 0;
>>>> + __cpu_features.__cpu_mmx = (edx & bit_MMX) ? 1 : 0;
>>>> + __cpu_features.__cpu_sse = (edx & bit_SSE) ? 1 : 0;
>>>> + __cpu_features.__cpu_sse2 = (edx & bit_SSE2) ? 1 : 0;
>>>> + __cpu_features.__cpu_popcnt = (ecx & bit_POPCNT) ? 1 : 0;
>>>> + __cpu_features.__cpu_sse3 = (ecx & bit_SSE3) ? 1 : 0;
>>>> + __cpu_features.__cpu_ssse3 = (ecx & bit_SSSE3) ? 1 : 0;
>>>> + __cpu_features.__cpu_sse4_1 = (ecx & bit_SSE4_1) ? 1 : 0;
>>>> + __cpu_features.__cpu_sse4_2 = (ecx & bit_SSE4_2) ? 1 : 0;
>>>> +}
>>>> +
>>>> +
>>>> +/* Sanity check for the vendor and cpu type flags. */
>>>> +
>>>> +static int
>>>> +sanity_check (void)
>>>> +{
>>>> + unsigned int one_type = 0;
>>>> +
>>>> + /* Vendor cannot be Intel and AMD. */
>>>> + gcc_assert((__cpu_model.__cpu_is_intel == 0)
>>>> + || (__cpu_model.__cpu_is_amd == 0));
>>>> +
>>>> + /* Only one CPU type can be set. */
>>>> + one_type = (__cpu_model.__cpu_is_intel_atom
>>>> + + __cpu_model.__cpu_is_intel_core2
>>>> + + __cpu_model.__cpu_is_intel_corei7_nehalem
>>>> + + __cpu_model.__cpu_is_intel_corei7_westmere
>>>> + + __cpu_model.__cpu_is_intel_corei7_sandybridge
>>>> + + __cpu_model.__cpu_is_amdfam10h_barcelona
>>>> + + __cpu_model.__cpu_is_amdfam10h_shanghai
>>>> + + __cpu_model.__cpu_is_amdfam10h_istanbul
>>>> + + __cpu_model.__cpu_is_amdfam15h_bdver1
>>>> + + __cpu_model.__cpu_is_amdfam15h_bdver2);
>>>> +
>>>> + gcc_assert (one_type <= 1);
>>>> + return 0;
>>>> +}
>>>> +
>>>> +/* A noinline function calling __get_cpuid. Having many calls to
>>>> + cpuid in one function in 32-bit mode causes GCC to complain:
>>>> + "can't find a register in class 'CLOBBERED_REGS'". This is
>>>> + related to PR rtl-optimization 44174. */
>>>> +
>>>> +static int __attribute__ ((noinline))
>>>> +__get_cpuid_output (unsigned int __level,
>>>> + unsigned int *__eax, unsigned int *__ebx,
>>>> + unsigned int *__ecx, unsigned int *__edx)
>>>> +{
>>>> + return __get_cpuid (__level, __eax, __ebx, __ecx, __edx);
>>>> +}
>>>> +
>>>> +
>>>> +/* A constructor function that sets __cpu_model and __cpu_features with
>>>> + the right values. This needs to run only once. This constructor is
>>>> + given the highest priority and it should run before constructors without
>>>> + the priority set. However, it still runs after ifunc initializers and
>>>> + needs to be called explicitly there. */
>>>> +
>>>> +int __attribute__ ((constructor (101)))
>>>> +__cpu_indicator_init (void)
>>>> +{
>>>> + unsigned int eax, ebx, ecx, edx;
>>>> +
>>>> + int max_level = 5;
>>>> + unsigned int vendor;
>>>> + unsigned int model, family, brand_id;
>>>> + unsigned int extended_model, extended_family;
>>>> + static int called = 0;
>>>> +
>>>> + /* This function needs to run just once. */
>>>> + if (called)
>>>> + return 0;
>>>> + else
>>>> + called = 1;
>>>> +
>>>> + /* Assume cpuid insn present. Run in level 0 to get vendor id. */
>>>> + if (!__get_cpuid_output (0, &eax, &ebx, &ecx, &edx))
>>>> + return -1;
>>>> +
>>>> + vendor = ebx;
>>>> + max_level = eax;
>>>> +
>>>> + if (max_level < 1)
>>>> + return -1;
>>>> +
>>>> + if (!__get_cpuid_output (1, &eax, &ebx, &ecx, &edx))
>>>> + return -1;
>>>> +
>>>> + model = (eax >> 4) & 0x0f;
>>>> + family = (eax >> 8) & 0x0f;
>>>> + brand_id = ebx & 0xff;
>>>> + extended_model = (eax >> 12) & 0xf0;
>>>> + extended_family = (eax >> 20) & 0xff;
>>>> +
>>>> + if (vendor == SIG_INTEL)
>>>> + {
>>>> + /* Adjust model and family for Intel CPUS. */
>>>> + if (family == 0x0f)
>>>> + {
>>>> + family += extended_family;
>>>> + model += extended_model;
>>>> + }
>>>> + else if (family == 0x06)
>>>> + model += extended_model;
>>>> +
>>>> + /* Get CPU type. */
>>>> + __cpu_model.__cpu_is_intel = 1;
>>>> + get_intel_cpu (family, model, brand_id);
>>>> + }
>>>> +
>>>> + if (vendor == SIG_AMD)
>>>> + {
>>>> + /* Adjust model and family for AMD CPUS. */
>>>> + if (family == 0x0f)
>>>> + {
>>>> + family += extended_family;
>>>> + model += extended_model;
>>>> + }
>>>> +
>>>> + /* Get CPU type. */
>>>> + __cpu_model.__cpu_is_amd = 1;
>>>> + get_amd_cpu (family, model);
>>>> + }
>>>> +
>>>> + /* Find available features. */
>>>> + get_available_features (ecx, edx);
>>>> +
>>>> + sanity_check ();
>>>> +
>>>> + return 0;
>>>> +}
>>>> Index: libgcc/config/i386/libgcc-glibc.ver
>>>> ===================================================================
>>>> --- libgcc/config/i386/libgcc-glibc.ver (revision 184971)
>>>> +++ libgcc/config/i386/libgcc-glibc.ver (working copy)
>>>> @@ -147,6 +147,11 @@ GCC_4.3.0 {
>>>> __trunctfxf2
>>>> __unordtf2
>>>> }
>>>> +
>>>> +GCC_4.8.0 {
>>>> + __cpu_model
>>>> + __cpu_features
>>>> +}
>>>> %else
>>>> GCC_4.4.0 {
>>>> __addtf3
>>>> @@ -183,4 +188,9 @@ GCC_4.4.0 {
>>>> GCC_4.5.0 {
>>>> __extendxftf2
>>>> }
>>>> +
>>>> +GCC_4.8.0 {
>>>> + __cpu_model
>>>> + __cpu_features
>>>> +}
>>>> %endif
>>>> Index: gcc/config/i386/i386-builtin-types.def
>>>> ===================================================================
>>>> --- gcc/config/i386/i386-builtin-types.def (revision 184971)
>>>> +++ gcc/config/i386/i386-builtin-types.def (working copy)
>>>> @@ -143,6 +143,7 @@ DEF_FUNCTION_TYPE (UINT64)
>>>> DEF_FUNCTION_TYPE (UNSIGNED)
>>>> DEF_FUNCTION_TYPE (VOID)
>>>> DEF_FUNCTION_TYPE (PVOID)
>>>> +DEF_FUNCTION_TYPE (INT)
>>>>
>>>> DEF_FUNCTION_TYPE (FLOAT, FLOAT)
>>>> DEF_FUNCTION_TYPE (FLOAT128, FLOAT128)
>>>> Index: gcc/config/i386/i386.c
>>>> ===================================================================
>>>> --- gcc/config/i386/i386.c (revision 184971)
>>>> +++ gcc/config/i386/i386.c (working copy)
>>>> @@ -25637,6 +25637,33 @@ enum ix86_builtins
>>>> /* CFString built-in for darwin */
>>>> IX86_BUILTIN_CFSTRING,
>>>>
>>>> + /* Builtins to get CPU features. */
>>>> + IX86_BUILTIN_CPU_SUPPORTS_CMOV,
>>>> + IX86_BUILTIN_CPU_SUPPORTS_MMX,
>>>> + IX86_BUILTIN_CPU_SUPPORTS_POPCOUNT,
>>>> + IX86_BUILTIN_CPU_SUPPORTS_SSE,
>>>> + IX86_BUILTIN_CPU_SUPPORTS_SSE2,
>>>> + IX86_BUILTIN_CPU_SUPPORTS_SSE3,
>>>> + IX86_BUILTIN_CPU_SUPPORTS_SSSE3,
>>>> + IX86_BUILTIN_CPU_SUPPORTS_SSE4_1,
>>>> + IX86_BUILTIN_CPU_SUPPORTS_SSE4_2,
>>>> + /* Builtins to get CPU type. */
>>>> + IX86_BUILTIN_CPU_INIT,
>>>> + IX86_BUILTIN_CPU_IS_AMD,
>>>> + IX86_BUILTIN_CPU_IS_INTEL,
>>>> + IX86_BUILTIN_CPU_IS_INTEL_ATOM,
>>>> + IX86_BUILTIN_CPU_IS_INTEL_CORE2,
>>>> + IX86_BUILTIN_CPU_IS_INTEL_COREI7,
>>>> + IX86_BUILTIN_CPU_IS_INTEL_COREI7_NEHALEM,
>>>> + IX86_BUILTIN_CPU_IS_INTEL_COREI7_WESTMERE,
>>>> + IX86_BUILTIN_CPU_IS_INTEL_COREI7_SANDYBRIDGE,
>>>> + IX86_BUILTIN_CPU_IS_AMDFAM10H,
>>>> + IX86_BUILTIN_CPU_IS_AMDFAM10H_BARCELONA,
>>>> + IX86_BUILTIN_CPU_IS_AMDFAM10H_SHANGHAI,
>>>> + IX86_BUILTIN_CPU_IS_AMDFAM10H_ISTANBUL,
>>>> + IX86_BUILTIN_CPU_IS_AMDFAM15H_BDVER1,
>>>> + IX86_BUILTIN_CPU_IS_AMDFAM15H_BDVER2,
>>>> +
>>>> IX86_BUILTIN_MAX
>>>> };
>>>>
>>>> @@ -27446,6 +27473,593 @@ ix86_init_mmx_sse_builtins (void)
>>>> }
>>>> }
>>>>
>>>> +/* Returns a struct type with name NAME and number of fields equal to
>>>> + NUM_FIELDS. Each field is an unsigned int bit field of length 1 bit. */
>>>> +
>>>> +static tree
>>>> +build_struct_with_one_bit_fields (int num_fields, const char *name)
>>>> +{
>>>> + int i;
>>>> + char field_name [10];
>>>> + tree field = NULL_TREE, field_chain = NULL_TREE;
>>>> + tree type = make_node (RECORD_TYPE);
>>>> +
>>>> + strcpy (field_name, "k_field");
>>>> +
>>>> + for (i = 0; i < num_fields; i++)
>>>> + {
>>>> + /* Name the fields, 0_field, 1_field, ... */
>>>> + field_name [0] = '0' + i;
>>>> + field = build_decl (UNKNOWN_LOCATION, FIELD_DECL,
>>>> + get_identifier (field_name), unsigned_type_node);
>>>> + DECL_BIT_FIELD (field) = 1;
>>>> + DECL_SIZE (field) = bitsize_one_node;
>>>> + if (field_chain != NULL_TREE)
>>>> + DECL_CHAIN (field) = field_chain;
>>>> + field_chain = field;
>>>> + }
>>>> + finish_builtin_struct (type, name, field_chain, NULL_TREE);
>>>> + return type;
>>>> +}
>>>> +
>>>> +/* Returns an extern, comdat VAR_DECL of type TYPE and name NAME. */
>>>> +
>>>> +static tree
>>>> +make_var_decl (tree type, const char *name)
>>>> +{
>>>> + tree new_decl;
>>>> + struct varpool_node *vnode;
>>>> +
>>>> + new_decl = build_decl (UNKNOWN_LOCATION,
>>>> + VAR_DECL,
>>>> + get_identifier(name),
>>>> + type);
>>>> +
>>>> + DECL_EXTERNAL (new_decl) = 1;
>>>> + TREE_STATIC (new_decl) = 1;
>>>> + TREE_PUBLIC (new_decl) = 1;
>>>> + DECL_INITIAL (new_decl) = 0;
>>>> + DECL_ARTIFICIAL (new_decl) = 0;
>>>> + DECL_PRESERVE_P (new_decl) = 1;
>>>> +
>>>> + make_decl_one_only (new_decl, DECL_ASSEMBLER_NAME (new_decl));
>>>> + assemble_variable (new_decl, 0, 0, 0);
>>>> +
>>>> + vnode = varpool_node (new_decl);
>>>> + gcc_assert (vnode != NULL);
>>>> + /* Set finalized to 1, otherwise it asserts in function "write_symbol" in
>>>> + lto-streamer-out.c. */
>>>> + vnode->finalized = 1;
>>>> +
>>>> + return new_decl;
>>>> +}
>>>> +
>>>> +/* Traverses the chain of fields in STRUCT_TYPE and returns the FIELD_NUM
>>>> + numbered field. */
>>>> +
>>>> +static tree
>>>> +get_field_from_struct (tree struct_type, int field_num)
>>>> +{
>>>> + int i;
>>>> + tree field = TYPE_FIELDS (struct_type);
>>>> +
>>>> + for (i = 0; i < field_num; i++, field = DECL_CHAIN(field))
>>>> + {
>>>> + gcc_assert (field != NULL_TREE);
>>>> + }
>>>> +
>>>> + return field;
>>>> +}
>>>> +
>>>> +/* FNDECL is a __builtin_cpu_* call that is folded into an integer defined
>>>> + in libgcc/config/i386/i386-cpuinfo.c */
>>>> +
>>>> +static tree
>>>> +fold_builtin_cpu (enum ix86_builtins fn_code)
>>>> +{
>>>> + /* This is the order of bit-fields in __processor_features in
>>>> + i386-cpuinfo.c */
>>>> + enum processor_features
>>>> + {
>>>> + F_CMOV = 0,
>>>> + F_MMX,
>>>> + F_POPCNT,
>>>> + F_SSE,
>>>> + F_SSE2,
>>>> + F_SSE3,
>>>> + F_SSSE3,
>>>> + F_SSE4_1,
>>>> + F_SSE4_2,
>>>> + F_MAX
>>>> + };
>>>> +
>>>> + /* This is the order of bit-fields in __processor_model in
>>>> + i386-cpuinfo.c */
>>>> + enum processor_model
>>>> + {
>>>> + M_AMD = 0,
>>>> + M_INTEL,
>>>> + M_INTEL_ATOM,
>>>> + M_INTEL_CORE2,
>>>> + M_INTEL_COREI7,
>>>> + M_INTEL_COREI7_NEHALEM,
>>>> + M_INTEL_COREI7_WESTMERE,
>>>> + M_INTEL_COREI7_SANDYBRIDGE,
>>>> + M_AMDFAM10H,
>>>> + M_AMDFAM10H_BARCELONA,
>>>> + M_AMDFAM10H_SHANGHAI,
>>>> + M_AMDFAM10H_ISTANBUL,
>>>> + M_AMDFAM15H_BDVER1,
>>>> + M_AMDFAM15H_BDVER2,
>>>> + M_MAX
>>>> + };
>>>> +
>>>> + static tree __processor_features_type = NULL_TREE;
>>>> + static tree __cpu_features_var = NULL_TREE;
>>>> + static tree __processor_model_type = NULL_TREE;
>>>> + static tree __cpu_model_var = NULL_TREE;
>>>> + static tree field;
>>>> + static tree which_struct;
>>>> +
>>>> + if (__processor_features_type == NULL_TREE)
>>>> + __processor_features_type = build_struct_with_one_bit_fields (F_MAX,
>>>> + "__processor_features");
>>>> +
>>>> + if (__processor_model_type == NULL_TREE)
>>>> + __processor_model_type = build_struct_with_one_bit_fields (M_MAX,
>>>> + "__processor_model");
>>>> +
>>>> + if (__cpu_features_var == NULL_TREE)
>>>> + __cpu_features_var = make_var_decl (__processor_features_type,
>>>> + "__cpu_features");
>>>> +
>>>> + if (__cpu_model_var == NULL_TREE)
>>>> + __cpu_model_var = make_var_decl (__processor_model_type,
>>>> + "__cpu_model");
>>>> +
>>>> + /* Look at the code to identify the field requested. */
>>>> + switch (fn_code)
>>>> + {
>>>> + case IX86_BUILTIN_CPU_SUPPORTS_CMOV:
>>>> + field = get_field_from_struct (__processor_features_type, F_CMOV);
>>>> + which_struct = __cpu_features_var;
>>>> + break;
>>>> + case IX86_BUILTIN_CPU_SUPPORTS_MMX:
>>>> + field = get_field_from_struct (__processor_features_type, F_MMX);
>>>> + which_struct = __cpu_features_var;
>>>> + break;
>>>> + case IX86_BUILTIN_CPU_SUPPORTS_POPCOUNT:
>>>> + field = get_field_from_struct (__processor_features_type, F_POPCNT);
>>>> + which_struct = __cpu_features_var;
>>>> + break;
>>>> + case IX86_BUILTIN_CPU_SUPPORTS_SSE:
>>>> + field = get_field_from_struct (__processor_features_type, F_SSE);
>>>> + which_struct = __cpu_features_var;
>>>> + break;
>>>> + case IX86_BUILTIN_CPU_SUPPORTS_SSE2:
>>>> + field = get_field_from_struct (__processor_features_type, F_SSE2);
>>>> + which_struct = __cpu_features_var;
>>>> + break;
>>>> + case IX86_BUILTIN_CPU_SUPPORTS_SSE3:
>>>> + field = get_field_from_struct (__processor_features_type, F_SSE3);
>>>> + which_struct = __cpu_features_var;
>>>> + break;
>>>> + case IX86_BUILTIN_CPU_SUPPORTS_SSSE3:
>>>> + field = get_field_from_struct (__processor_features_type, F_SSSE3);
>>>> + which_struct = __cpu_features_var;
>>>> + break;
>>>> + case IX86_BUILTIN_CPU_SUPPORTS_SSE4_1:
>>>> + field = get_field_from_struct (__processor_features_type, F_SSE4_1);
>>>> + which_struct = __cpu_features_var;
>>>> + break;
>>>> + case IX86_BUILTIN_CPU_SUPPORTS_SSE4_2:
>>>> + field = get_field_from_struct (__processor_features_type, F_SSE4_2);
>>>> + which_struct = __cpu_features_var;
>>>> + break;
>>>> + case IX86_BUILTIN_CPU_IS_AMD:
>>>> + field = get_field_from_struct (__processor_model_type, M_AMD);
>>>> + which_struct = __cpu_model_var;
>>>> + break;
>>>> + case IX86_BUILTIN_CPU_IS_INTEL:
>>>> + field = get_field_from_struct (__processor_model_type, M_INTEL);
>>>> + which_struct = __cpu_model_var;
>>>> + break;
>>>> + case IX86_BUILTIN_CPU_IS_INTEL_ATOM:
>>>> + field = get_field_from_struct (__processor_model_type, M_INTEL_ATOM);
>>>> + which_struct = __cpu_model_var;
>>>> + break;
>>>> + case IX86_BUILTIN_CPU_IS_INTEL_CORE2:
>>>> + field = get_field_from_struct (__processor_model_type, M_INTEL_CORE2);
>>>> + which_struct = __cpu_model_var;
>>>> + break;
>>>> + case IX86_BUILTIN_CPU_IS_INTEL_COREI7:
>>>> + field = get_field_from_struct (__processor_model_type,
>>>> + M_INTEL_COREI7);
>>>> + which_struct = __cpu_model_var;
>>>> + break;
>>>> + case IX86_BUILTIN_CPU_IS_INTEL_COREI7_NEHALEM:
>>>> + field = get_field_from_struct (__processor_model_type,
>>>> + M_INTEL_COREI7_NEHALEM);
>>>> + which_struct = __cpu_model_var;
>>>> + break;
>>>> + case IX86_BUILTIN_CPU_IS_INTEL_COREI7_WESTMERE:
>>>> + field = get_field_from_struct (__processor_model_type,
>>>> + M_INTEL_COREI7_WESTMERE);
>>>> + which_struct = __cpu_model_var;
>>>> + break;
>>>> + case IX86_BUILTIN_CPU_IS_INTEL_COREI7_SANDYBRIDGE:
>>>> + field = get_field_from_struct (__processor_model_type,
>>>> + M_INTEL_COREI7_SANDYBRIDGE);
>>>> + which_struct = __cpu_model_var;
>>>> + break;
>>>> + case IX86_BUILTIN_CPU_IS_AMDFAM10H:
>>>> + field = get_field_from_struct (__processor_model_type,
>>>> + M_AMDFAM10H);
>>>> + which_struct = __cpu_model_var;
>>>> + break;
>>>> + case IX86_BUILTIN_CPU_IS_AMDFAM10H_BARCELONA:
>>>> + field = get_field_from_struct (__processor_model_type,
>>>> + M_AMDFAM10H_BARCELONA);
>>>> + which_struct = __cpu_model_var;
>>>> + break;
>>>> + case IX86_BUILTIN_CPU_IS_AMDFAM10H_SHANGHAI:
>>>> + field = get_field_from_struct (__processor_model_type,
>>>> + M_AMDFAM10H_SHANGHAI);
>>>> + which_struct = __cpu_model_var;
>>>> + break;
>>>> + case IX86_BUILTIN_CPU_IS_AMDFAM10H_ISTANBUL:
>>>> + field = get_field_from_struct (__processor_model_type,
>>>> + M_AMDFAM10H_ISTANBUL);
>>>> + which_struct = __cpu_model_var;
>>>> + break;
>>>> + case IX86_BUILTIN_CPU_IS_AMDFAM15H_BDVER1:
>>>> + field = get_field_from_struct (__processor_model_type,
>>>> + M_AMDFAM15H_BDVER1);
>>>> + which_struct = __cpu_model_var;
>>>> + break;
>>>> + case IX86_BUILTIN_CPU_IS_AMDFAM15H_BDVER2:
>>>> + field = get_field_from_struct (__processor_model_type,
>>>> + M_AMDFAM15H_BDVER2);
>>>> + which_struct = __cpu_model_var;
>>>> + break;
>>>> + default:
>>>> + return NULL_TREE;
>>>> + }
>>>> +
>>>> + return build3 (COMPONENT_REF, TREE_TYPE (field), which_struct, field, NULL_TREE);
>>>> +}
>>>> +
>>>> +static tree
>>>> +ix86_fold_builtin (tree fndecl, int n_args ATTRIBUTE_UNUSED,
>>>> + tree *args ATTRIBUTE_UNUSED, bool ignore ATTRIBUTE_UNUSED)
>>>> +{
>>>> + const char* decl_name = IDENTIFIER_POINTER (DECL_NAME (fndecl));
>>>> + if (DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_MD
>>>> + && strstr(decl_name, "__builtin_cpu") != NULL)
>>>> + {
>>>> + enum ix86_builtins code = (enum ix86_builtins)
>>>> + DECL_FUNCTION_CODE (fndecl);
>>>> + return fold_builtin_cpu (code);
>>>> + }
>>>> + return NULL_TREE;
>>>> +}
>>>> +
>>>> +/* A builtin to init/return the cpu type or feature. Returns an
>>>> + integer and the type is a const if IS_CONST is set. */
>>>> +
>>>> +static void
>>>> +make_platform_builtin (const char* name, int code, int is_const)
>>>> +{
>>>> + tree decl;
>>>> + tree type;
>>>> +
>>>> + type = ix86_get_builtin_func_type (INT_FTYPE_VOID);
>>>> + decl = add_builtin_function (name, type, code, BUILT_IN_MD,
>>>> + NULL, NULL_TREE);
>>>> + gcc_assert (decl != NULL_TREE);
>>>> + ix86_builtins[(int) code] = decl;
>>>> + if (is_const)
>>>> + TREE_READONLY (decl) = 1;
>>>> +}
>>>> +
>>>> +/* Builtins to get CPU type and features supported. */
>>>> +
>>>> +static void
>>>> +ix86_init_platform_type_builtins (void)
>>>> +{
>>>> + make_platform_builtin ("__builtin_cpu_init",
>>>> + IX86_BUILTIN_CPU_INIT, 0);
>>>> + make_platform_builtin ("__builtin_cpu_supports_cmov",
>>>> + IX86_BUILTIN_CPU_SUPPORTS_CMOV, 1);
>>>> + make_platform_builtin ("__builtin_cpu_supports_mmx",
>>>> + IX86_BUILTIN_CPU_SUPPORTS_MMX, 1);
>>>> + make_platform_builtin ("__builtin_cpu_supports_popcount",
>>>> + IX86_BUILTIN_CPU_SUPPORTS_POPCOUNT, 1);
>>>> + make_platform_builtin ("__builtin_cpu_supports_sse",
>>>> + IX86_BUILTIN_CPU_SUPPORTS_SSE, 1);
>>>> + make_platform_builtin ("__builtin_cpu_supports_sse2",
>>>> + IX86_BUILTIN_CPU_SUPPORTS_SSE2, 1);
>>>> + make_platform_builtin ("__builtin_cpu_supports_sse3",
>>>> + IX86_BUILTIN_CPU_SUPPORTS_SSE3, 1);
>>>> + make_platform_builtin ("__builtin_cpu_supports_ssse3",
>>>> + IX86_BUILTIN_CPU_SUPPORTS_SSSE3, 1);
>>>> + make_platform_builtin ("__builtin_cpu_supports_sse4_1",
>>>> + IX86_BUILTIN_CPU_SUPPORTS_SSE4_1, 1);
>>>> + make_platform_builtin ("__builtin_cpu_supports_sse4_2",
>>>> + IX86_BUILTIN_CPU_SUPPORTS_SSE4_2, 1);
>>>> + make_platform_builtin ("__builtin_cpu_is_amd",
>>>> + IX86_BUILTIN_CPU_IS_AMD, 1);
>>>> + make_platform_builtin ("__builtin_cpu_is_intel_atom",
>>>> + IX86_BUILTIN_CPU_IS_INTEL_ATOM, 1);
>>>> + make_platform_builtin ("__builtin_cpu_is_intel_core2",
>>>> + IX86_BUILTIN_CPU_IS_INTEL_CORE2, 1);
>>>> + make_platform_builtin ("__builtin_cpu_is_intel",
>>>> + IX86_BUILTIN_CPU_IS_INTEL, 1);
>>>> + make_platform_builtin ("__builtin_cpu_is_intel_corei7",
>>>> + IX86_BUILTIN_CPU_IS_INTEL_COREI7, 1);
>>>> + make_platform_builtin ("__builtin_cpu_is_intel_corei7_nehalem",
>>>> + IX86_BUILTIN_CPU_IS_INTEL_COREI7_NEHALEM, 1);
>>>> + make_platform_builtin ("__builtin_cpu_is_intel_corei7_westmere",
>>>> + IX86_BUILTIN_CPU_IS_INTEL_COREI7_WESTMERE, 1);
>>>> + make_platform_builtin ("__builtin_cpu_is_intel_corei7_sandybridge",
>>>> + IX86_BUILTIN_CPU_IS_INTEL_COREI7_SANDYBRIDGE, 1);
>>>> + make_platform_builtin ("__builtin_cpu_is_amdfam10",
>>>> + IX86_BUILTIN_CPU_IS_AMDFAM10H, 1);
>>>> + make_platform_builtin ("__builtin_cpu_is_amdfam10_barcelona",
>>>> + IX86_BUILTIN_CPU_IS_AMDFAM10H_BARCELONA, 1);
>>>> + make_platform_builtin ("__builtin_cpu_is_amdfam10_shanghai",
>>>> + IX86_BUILTIN_CPU_IS_AMDFAM10H_SHANGHAI, 1);
>>>> + make_platform_builtin ("__builtin_cpu_is_amdfam10_istanbul",
>>>> + IX86_BUILTIN_CPU_IS_AMDFAM10H_ISTANBUL, 1);
>>>> + make_platform_builtin ("__builtin_cpu_is_amdfam15_bdver1",
>>>> + IX86_BUILTIN_CPU_IS_AMDFAM15H_BDVER1, 1);
>>>> + make_platform_builtin ("__builtin_cpu_is_amdfam15_bdver2",
>>>> + IX86_BUILTIN_CPU_IS_AMDFAM15H_BDVER2, 1);
>>>> +}
>>>> +
>>>> /* Internal method for ix86_init_builtins. */
>>>>
>>>> static void
>>>> @@ -27529,6 +28143,9 @@ ix86_init_builtins (void)
>>>>
>>>> ix86_init_builtin_types ();
>>>>
>>>> + /* Builtins to get CPU type and features. */
>>>> + ix86_init_platform_type_builtins ();
>>>> +
>>>> /* TFmode support builtins. */
>>>> def_builtin_const (0, "__builtin_infq",
>>>> FLOAT128_FTYPE_VOID, IX86_BUILTIN_INFQ);
>>>> @@ -29145,6 +29762,48 @@ ix86_expand_builtin (tree exp, rtx target, rtx sub
>>>> enum machine_mode mode0, mode1, mode2, mode3, mode4;
>>>> unsigned int fcode = DECL_FUNCTION_CODE (fndecl);
>>>>
>>>> + /* For CPU builtins that can be folded, fold first and expand the fold. */
>>>> + switch (fcode)
>>>> + {
>>>> + case IX86_BUILTIN_CPU_SUPPORTS_CMOV:
>>>> + case IX86_BUILTIN_CPU_SUPPORTS_MMX:
>>>> + case IX86_BUILTIN_CPU_SUPPORTS_POPCOUNT:
>>>> + case IX86_BUILTIN_CPU_SUPPORTS_SSE:
>>>> + case IX86_BUILTIN_CPU_SUPPORTS_SSE2:
>>>> + case IX86_BUILTIN_CPU_SUPPORTS_SSE3:
>>>> + case IX86_BUILTIN_CPU_SUPPORTS_SSSE3:
>>>> + case IX86_BUILTIN_CPU_SUPPORTS_SSE4_1:
>>>> + case IX86_BUILTIN_CPU_SUPPORTS_SSE4_2:
>>>> + case IX86_BUILTIN_CPU_IS_AMD:
>>>> + case IX86_BUILTIN_CPU_IS_INTEL:
>>>> + case IX86_BUILTIN_CPU_IS_INTEL_ATOM:
>>>> + case IX86_BUILTIN_CPU_IS_INTEL_CORE2:
>>>> + case IX86_BUILTIN_CPU_IS_INTEL_COREI7:
>>>> + case IX86_BUILTIN_CPU_IS_INTEL_COREI7_NEHALEM:
>>>> + case IX86_BUILTIN_CPU_IS_INTEL_COREI7_WESTMERE:
>>>> + case IX86_BUILTIN_CPU_IS_INTEL_COREI7_SANDYBRIDGE:
>>>> + case IX86_BUILTIN_CPU_IS_AMDFAM10H:
>>>> + case IX86_BUILTIN_CPU_IS_AMDFAM10H_BARCELONA:
>>>> + case IX86_BUILTIN_CPU_IS_AMDFAM10H_SHANGHAI:
>>>> + case IX86_BUILTIN_CPU_IS_AMDFAM10H_ISTANBUL:
>>>> + case IX86_BUILTIN_CPU_IS_AMDFAM15H_BDVER1:
>>>> + case IX86_BUILTIN_CPU_IS_AMDFAM15H_BDVER2:
>>>> + {
>>>> + tree fold_expr = fold_builtin_cpu ((enum ix86_builtins) fcode);
>>>> + gcc_assert (fold_expr != NULL_TREE);
>>>> + return expand_expr (fold_expr, target, mode, EXPAND_NORMAL);
>>>> + }
>>>> + case IX86_BUILTIN_CPU_INIT:
>>>> + {
>>>> + /* Make it call __cpu_indicator_init in libgcc. */
>>>> + tree call_expr, fndecl, type;
>>>> + type = build_function_type_list (integer_type_node, NULL_TREE);
>>>> + fndecl = build_fn_decl ("__cpu_indicator_init", type);
>>>> + call_expr = build_call_expr (fndecl, 0);
>>>> + return expand_expr (call_expr, target, mode, EXPAND_NORMAL);
>>>> + }
>>>> + }
>>>> +
>>>> /* Determine whether the builtin function is available under the current ISA.
>>>> Originally the builtin was not created if it wasn't applicable to the
>>>> current ISA based on the command line switches. With function specific
>>>> @@ -38610,6 +39269,12 @@ ix86_autovectorize_vector_sizes (void)
>>>> #undef TARGET_BUILD_BUILTIN_VA_LIST
>>>> #define TARGET_BUILD_BUILTIN_VA_LIST ix86_build_builtin_va_list
>>>>
>>>> +#undef TARGET_FOLD_BUILTIN
>>>> +#define TARGET_FOLD_BUILTIN ix86_fold_builtin
>>>> +
>>>> #undef TARGET_ENUM_VA_LIST_P
>>>> #define TARGET_ENUM_VA_LIST_P ix86_enum_va_list
>>>> Index: gcc/testsuite/gcc.target/i386/builtin_target.c
>>>> ===================================================================
>>>> --- gcc/testsuite/gcc.target/i386/builtin_target.c (revision 0)
>>>> +++ gcc/testsuite/gcc.target/i386/builtin_target.c (revision 0)
>>>> @@ -0,0 +1,61 @@
>>>> +/* This test checks if the __builtin_cpu_* calls are recognized. */
>>>> +
>>>> +/* { dg-do run } */
>>>> +
>>>> +int
>>>> +fn1 ()
>>>> +{
>>>> + if (__builtin_cpu_supports_cmov () < 0)
>>>> + return -1;
>>>> + if (__builtin_cpu_supports_mmx () < 0)
>>>> + return -1;
>>>> + if (__builtin_cpu_supports_popcount () < 0)
>>>> + return -1;
>>>> + if (__builtin_cpu_supports_sse () < 0)
>>>> + return -1;
>>>> + if (__builtin_cpu_supports_sse2 () < 0)
>>>> + return -1;
>>>> + if (__builtin_cpu_supports_sse3 () < 0)
>>>> + return -1;
>>>> + if (__builtin_cpu_supports_ssse3 () < 0)
>>>> + return -1;
>>>> + if (__builtin_cpu_supports_sse4_1 () < 0)
>>>> + return -1;
>>>> + if (__builtin_cpu_supports_sse4_2 () < 0)
>>>> + return -1;
>>>> + if (__builtin_cpu_is_amd () < 0)
>>>> + return -1;
>>>> + if (__builtin_cpu_is_intel () < 0)
>>>> + return -1;
>>>> + if (__builtin_cpu_is_intel_atom () < 0)
>>>> + return -1;
>>>> + if (__builtin_cpu_is_intel_core2 () < 0)
>>>> + return -1;
>>>> + if (__builtin_cpu_is_intel_corei7 () < 0)
>>>> + return -1;
>>>> + if (__builtin_cpu_is_intel_corei7_nehalem () < 0)
>>>> + return -1;
>>>> + if (__builtin_cpu_is_intel_corei7_westmere () < 0)
>>>> + return -1;
>>>> + if (__builtin_cpu_is_intel_corei7_sandybridge () < 0)
>>>> + return -1;
>>>> + if (__builtin_cpu_is_amdfam10 () < 0)
>>>> + return -1;
>>>> + if (__builtin_cpu_is_amdfam10_barcelona () < 0)
>>>> + return -1;
>>>> + if (__builtin_cpu_is_amdfam10_shanghai () < 0)
>>>> + return -1;
>>>> + if (__builtin_cpu_is_amdfam10_istanbul () < 0)
>>>> + return -1;
>>>> + if (__builtin_cpu_is_amdfam15_bdver1 () < 0)
>>>> + return -1;
>>>> + if (__builtin_cpu_is_amdfam15_bdver2 () < 0)
>>>> + return -1;
>>>> +
>>>> + return 0;
>>>> +}
>>>> +
>>>> +int main ()
>>>> +{
>>>> + return fn1 ();
>>>> +}
>>>>
>>>> --
>>>> This patch is available for review at http://codereview.appspot.com/5754058
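Editorial sketch of the folding done by the patch quoted above: fold_builtin_cpu turns each __builtin_cpu_supports_* call into a COMPONENT_REF, that is, a read of a one-bit field in a global structure that the libgcc initializer is expected to fill in. The user-level C below only imitates that idea. Every demo_ name, the struct layout, and the hand-fed initializer are assumptions made for illustration, not the libgcc definitions.

```c
#include <assert.h>

/* Hypothetical stand-in for the patch's __processor_features type:
   one single-bit field per feature.  */
struct demo_processor_features
{
  unsigned int cmov : 1;
  unsigned int mmx : 1;
  unsigned int popcnt : 1;
  unsigned int sse : 1;
  unsigned int sse2 : 1;
};

/* Hypothetical stand-in for the global __cpu_features variable.  */
static struct demo_processor_features demo_cpu_features;

/* Set once the initializer has run.  */
static int demo_cpu_initialized;

/* Stand-in for the initializer.  A real one would execute CPUID; here
   the caller passes the feature bits in directly.  */
static void
demo_cpu_init (int has_sse, int has_sse2)
{
  demo_cpu_features.sse = has_sse != 0;
  demo_cpu_features.sse2 = has_sse2 != 0;
  demo_cpu_initialized = 1;
}

/* What a folded feature query amounts to: a plain field read.  The
   assert models the ordering concern from the thread: the read is
   only meaningful after the initializer has run.  */
static int
demo_cpu_supports_sse2 (void)
{
  assert (demo_cpu_initialized);
  return demo_cpu_features.sse2;
}
```

Calling demo_cpu_init (1, 1) and then demo_cpu_supports_sse2 () yields 1; skipping the initializer trips the assertion, which is the situation an IFUNC resolver has to avoid by calling the init routine itself.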
Message from matz@suse.de
2012-03-30T12:47:12+00:00matz_suse.deurn:md5:ff7998dacdd51016d3f5c980714e6941
Hi,
On Thu, 29 Mar 2012, Sriraman Tallam wrote:
> +struct __processor_model
> +{
> + /* Vendor. */
> + unsigned int __cpu_is_amd : 1;
> + unsigned int __cpu_is_intel : 1;
> + /* CPU type. */
> + unsigned int __cpu_is_intel_atom : 1;
> + unsigned int __cpu_is_intel_core2 : 1;
> + unsigned int __cpu_is_intel_corei7 : 1;
> + unsigned int __cpu_is_intel_corei7_nehalem : 1;
> + unsigned int __cpu_is_intel_corei7_westmere : 1;
> + unsigned int __cpu_is_intel_corei7_sandybridge : 1;
> + unsigned int __cpu_is_amdfam10h : 1;
> + unsigned int __cpu_is_amdfam10h_barcelona : 1;
> + unsigned int __cpu_is_amdfam10h_shanghai : 1;
> + unsigned int __cpu_is_amdfam10h_istanbul : 1;
> + unsigned int __cpu_is_amdfam15h_bdver1 : 1;
> + unsigned int __cpu_is_amdfam15h_bdver2 : 1;
> +} __cpu_model;
It doesn't make sense for the model to be a bitfield, a processor will
have only ever exactly one model. Just make it an enum or even just an
int.
Ciao,
Michael.
Message from unknown
2012-03-30T22:56:05+00:00Sriramanurn:md5:6e4266843962ec2c34f73354e0015098
Message from tmsriram@google.com
2012-03-30T23:03:00+00:00Sriramanurn:md5:df979d693c68bea44c74a2dbc08a43e3
On Fri, Mar 30, 2012 at 5:47 AM, Michael Matz <matz@suse.de> wrote:
> Hi,
>
> On Thu, 29 Mar 2012, Sriraman Tallam wrote:
>
>> +struct __processor_model
>> +{
>> + /* Vendor. */
>> + unsigned int __cpu_is_amd : 1;
>> + unsigned int __cpu_is_intel : 1;
>> + /* CPU type. */
>> + unsigned int __cpu_is_intel_atom : 1;
>> + unsigned int __cpu_is_intel_core2 : 1;
>> + unsigned int __cpu_is_intel_corei7 : 1;
>> + unsigned int __cpu_is_intel_corei7_nehalem : 1;
>> + unsigned int __cpu_is_intel_corei7_westmere : 1;
>> + unsigned int __cpu_is_intel_corei7_sandybridge : 1;
>> + unsigned int __cpu_is_amdfam10h : 1;
>> + unsigned int __cpu_is_amdfam10h_barcelona : 1;
>> + unsigned int __cpu_is_amdfam10h_shanghai : 1;
>> + unsigned int __cpu_is_amdfam10h_istanbul : 1;
>> + unsigned int __cpu_is_amdfam15h_bdver1 : 1;
>> + unsigned int __cpu_is_amdfam15h_bdver2 : 1;
>> +} __cpu_model;
>
> It doesn't make sense for the model to be a bitfield, a processor will
> have only ever exactly one model. Just make it an enum or even just an
> int.
Not entirely true, nehalem and corei7 can be both set. However, I
modified this by dividing it into types and sub types and then did
what you said.
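
Editorial sketch of the layout discussed here: instead of one bit per model, the record holds a vendor, a type and a subtype, each of which takes exactly one value, so "corei7" and "nehalem" can both be true without a bitfield. The enumerator names and values below are hypothetical; the revised patch may use different ones.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical enumerators.  Zero means "not recognized".  */
enum demo_vendor
{
  DEMO_VENDOR_UNKNOWN,
  DEMO_VENDOR_INTEL,
  DEMO_VENDOR_AMD
};

enum demo_cpu_type
{
  DEMO_TYPE_UNKNOWN,
  DEMO_INTEL_ATOM,
  DEMO_INTEL_CORE2,
  DEMO_INTEL_COREI7,
  DEMO_AMDFAM10H,
  DEMO_AMDFAM15H
};

enum demo_cpu_subtype
{
  DEMO_SUBTYPE_UNKNOWN,
  DEMO_COREI7_NEHALEM,
  DEMO_COREI7_WESTMERE,
  DEMO_COREI7_SANDYBRIDGE,
  DEMO_AMDFAM10H_BARCELONA,
  DEMO_AMDFAM10H_SHANGHAI,
  DEMO_AMDFAM10H_ISTANBUL,
  DEMO_AMDFAM15H_BDVER1,
  DEMO_AMDFAM15H_BDVER2
};

/* Three single-valued fields replace the fourteen one-bit flags.  */
struct demo_processor_model
{
  enum demo_vendor vendor;
  enum demo_cpu_type type;
  enum demo_cpu_subtype subtype;
};

/* Each query is an equality test on one field.  */
static int
demo_cpu_is_type (const struct demo_processor_model *m,
                  enum demo_cpu_type type)
{
  assert (m != NULL);
  return m->type == type;
}

static int
demo_cpu_is_subtype (const struct demo_processor_model *m,
                     enum demo_cpu_subtype subtype)
{
  assert (m != NULL);
  return m->subtype == subtype;
}
```

With a record set to { DEMO_VENDOR_INTEL, DEMO_INTEL_COREI7, DEMO_COREI7_NEHALEM }, both the corei7 type query and the nehalem subtype query succeed, while a westmere query fails.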
* config/i386/i386.c (build_processor_features_struct): New function.
(build_processor_model_struct): New function.
(make_var_decl): New function.
(get_field_from_struct): New function.
(fold_builtin_target): New function.
(ix86_fold_builtin): New function.
(ix86_expand_builtin): Expand new builtins by folding them.
(make_cpu_type_builtin): New function.
(ix86_init_platform_type_builtins): Make the new builtins.
(ix86_init_builtins): Make new builtins to detect CPU type.
(TARGET_FOLD_BUILTIN): New macro.
(IX86_BUILTIN_CPU_INIT): New enum value.
(IX86_BUILTIN_CPU_IS): New enum value.
(IX86_BUILTIN_CPU_SUPPORTS): New enum value.
* config/i386/i386-builtin-types.def: New function type.
* testsuite/gcc.target/builtin_target.c: New testcase.
* libgcc/config/i386/i386-cpuinfo.c: New file.
* libgcc/config/i386/t-cpuinfo: New file.
* libgcc/config.host: Include t-cpuinfo.
* libgcc/config/i386/libgcc-glibc.ver: Version symbols __cpu_model
and __cpu_features.
Patch available for review here:
http://codereview.appspot.com/5754058
Thanks,
-Sri.
Message from richard.guenther@gmail.com
2012-04-02T12:38:19+00:00richard.guenther_gmail.comurn:md5:6a014bc71bc02ba249b2aaa198937dfd
On Sat, Mar 31, 2012 at 1:03 AM, Sriraman Tallam <tmsriram@google.com> wrote:
> On Fri, Mar 30, 2012 at 5:47 AM, Michael Matz <matz@suse.de> wrote:
>> Hi,
>>
>> On Thu, 29 Mar 2012, Sriraman Tallam wrote:
>>
>>> +struct __processor_model
>>> +{
>>> + /* Vendor. */
>>> + unsigned int __cpu_is_amd : 1;
>>> + unsigned int __cpu_is_intel : 1;
>>> + /* CPU type. */
>>> + unsigned int __cpu_is_intel_atom : 1;
>>> + unsigned int __cpu_is_intel_core2 : 1;
>>> + unsigned int __cpu_is_intel_corei7 : 1;
>>> + unsigned int __cpu_is_intel_corei7_nehalem : 1;
>>> + unsigned int __cpu_is_intel_corei7_westmere : 1;
>>> + unsigned int __cpu_is_intel_corei7_sandybridge : 1;
>>> + unsigned int __cpu_is_amdfam10h : 1;
>>> + unsigned int __cpu_is_amdfam10h_barcelona : 1;
>>> + unsigned int __cpu_is_amdfam10h_shanghai : 1;
>>> + unsigned int __cpu_is_amdfam10h_istanbul : 1;
>>> + unsigned int __cpu_is_amdfam15h_bdver1 : 1;
>>> + unsigned int __cpu_is_amdfam15h_bdver2 : 1;
>>> +} __cpu_model;
>>
>> It doesn't make sense for the model to be a bitfield, a processor will
>> have only ever exactly one model. Just make it an enum or even just an
>> int.
>
> Not entirely true, nehalem and corei7 can be both set. However, I
> modified this by dividing it into types and sub types and then did
> what you said.
Uh... then I suppose you need to document somewhere what names
match to what cpuid family/model (supposedly that's where your two-layer
hierarchy comes from, which incidentally misses one layer, the vendor?)
Richard.
Message from tmsriram@google.com
2012-04-02T17:59:55+00:00Sriramanurn:md5:218d89a217546e72aa33572b46538033
On Mon, Apr 2, 2012 at 5:38 AM, Richard Guenther
<richard.guenther@gmail.com> wrote:
> On Sat, Mar 31, 2012 at 1:03 AM, Sriraman Tallam <tmsriram@google.com> wrote:
>> On Fri, Mar 30, 2012 at 5:47 AM, Michael Matz <matz@suse.de> wrote:
>>> Hi,
>>>
>>> On Thu, 29 Mar 2012, Sriraman Tallam wrote:
>>>
>>>> +struct __processor_model
>>>> +{
>>>> + /* Vendor. */
>>>> + unsigned int __cpu_is_amd : 1;
>>>> + unsigned int __cpu_is_intel : 1;
>>>> + /* CPU type. */
>>>> + unsigned int __cpu_is_intel_atom : 1;
>>>> + unsigned int __cpu_is_intel_core2 : 1;
>>>> + unsigned int __cpu_is_intel_corei7 : 1;
>>>> + unsigned int __cpu_is_intel_corei7_nehalem : 1;
>>>> + unsigned int __cpu_is_intel_corei7_westmere : 1;
>>>> + unsigned int __cpu_is_intel_corei7_sandybridge : 1;
>>>> + unsigned int __cpu_is_amdfam10h : 1;
>>>> + unsigned int __cpu_is_amdfam10h_barcelona : 1;
>>>> + unsigned int __cpu_is_amdfam10h_shanghai : 1;
>>>> + unsigned int __cpu_is_amdfam10h_istanbul : 1;
>>>> + unsigned int __cpu_is_amdfam15h_bdver1 : 1;
>>>> + unsigned int __cpu_is_amdfam15h_bdver2 : 1;
>>>> +} __cpu_model;
>>>
>>> It doesn't make sense for the model to be a bitfield, a processor will
>>> have only ever exactly one model. Just make it an enum or even just an
>>> int.
>>
>> Not entirely true, nehalem and corei7 can be both set. However, I
>> modified this by dividing it into types and sub types and then did
>> what you said.
>
> Uh... then I suppose you need to document somewhere what names
> match to what cpuid family/model (supposedly that's where your two-layer
> hierarchy comes from, which incidentally misses one layer, the vendor?)
Yes, it is 3-layer. Vendor, family and model.
What is the right place to document these?
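
Editorial sketch of the family/model layer mentioned here: both numbers come out of the EAX value returned by CPUID leaf 1. The decoder below follows the commonly documented rule (extended family is added when the base family is 0xf; extended model is prepended when the base family is 0x6 or 0xf). It is a pure function over a raw EAX value so that it can be read and tested without executing CPUID; the demo_ names are hypothetical and this is not the code in i386-cpuinfo.c.

```c
#include <assert.h>
#include <stddef.h>

struct demo_cpu_signature
{
  unsigned int family;
  unsigned int model;
};

/* Decode the display family and model from a CPUID leaf 1 EAX value.  */
static void
demo_decode_signature (unsigned int eax, struct demo_cpu_signature *out)
{
  unsigned int base_model = (eax >> 4) & 0xf;
  unsigned int base_family = (eax >> 8) & 0xf;
  unsigned int ext_model = (eax >> 16) & 0xf;
  unsigned int ext_family = (eax >> 20) & 0xff;

  assert (out != NULL);
  out->family = base_family;
  out->model = base_model;

  /* The extended family only applies when the base family is 0xf.  */
  if (base_family == 0xf)
    out->family += ext_family;

  /* The extended model supplies the high nibble of the model for
     base families 0x6 and 0xf.  */
  if (base_family == 0x6 || base_family == 0xf)
    out->model += ext_model << 4;
}
```

For example, the raw value 0x000106A5 decodes to family 6, model 0x1A, and 0x00100F42 decodes to family 0x10, model 4. A documentation table would then attach names such as "corei7" or "amdfam10h" to such pairs.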
Message from unknown
2012-04-03T00:45:58+00:00Sriramanurn:md5:42fd06ef9b1c4ff086e487e6eeacf1d6
Message from tmsriram@google.com
2012-04-03T00:48:04+00:00Sriramanurn:md5:7b74206b6b3c24df65fa71eea94be427
On Mon, Apr 2, 2012 at 5:38 AM, Richard Guenther
<richard.guenther@gmail.com> wrote:
> On Sat, Mar 31, 2012 at 1:03 AM, Sriraman Tallam <tmsriram@google.com> wrote:
>> On Fri, Mar 30, 2012 at 5:47 AM, Michael Matz <matz@suse.de> wrote:
>>> Hi,
>>>
>>> On Thu, 29 Mar 2012, Sriraman Tallam wrote:
>>>
>>>> +struct __processor_model
>>>> +{
>>>> + /* Vendor. */
>>>> + unsigned int __cpu_is_amd : 1;
>>>> + unsigned int __cpu_is_intel : 1;
>>>> + /* CPU type. */
>>>> + unsigned int __cpu_is_intel_atom : 1;
>>>> + unsigned int __cpu_is_intel_core2 : 1;
>>>> + unsigned int __cpu_is_intel_corei7 : 1;
>>>> + unsigned int __cpu_is_intel_corei7_nehalem : 1;
>>>> + unsigned int __cpu_is_intel_corei7_westmere : 1;
>>>> + unsigned int __cpu_is_intel_corei7_sandybridge : 1;
>>>> + unsigned int __cpu_is_amdfam10h : 1;
>>>> + unsigned int __cpu_is_amdfam10h_barcelona : 1;
>>>> + unsigned int __cpu_is_amdfam10h_shanghai : 1;
>>>> + unsigned int __cpu_is_amdfam10h_istanbul : 1;
>>>> + unsigned int __cpu_is_amdfam15h_bdver1 : 1;
>>>> + unsigned int __cpu_is_amdfam15h_bdver2 : 1;
>>>> +} __cpu_model;
>>>
>>> It doesn't make sense for the model to be a bitfield, a processor will
>>> have only ever exactly one model. Just make it an enum or even just an
>>> int.
>>
>> Not entirely true, nehalem and corei7 can be both set. However, I
>> modified this by dividing it into types and sub types and then did
>> what you said.
>
> Uh... then I suppose you need to document somewhere what names
> match to what cpuid family/model (supposedly that's where your two-layer
> hierarchy comes from, which incidentally misses one layer, the vendor?)
Added documentation to extend.texi
Patch available for review here:
http://codereview.appspot.com/5754058
Thanks,
-Sri.
Message from tmsriram@google.com
2012-04-03T19:47:39+00:00Sriramanurn:md5:dacd83e0b0380aae9c6cff81ffb8699a
Hi,
i386 maintainers - Is this patch ok?
Thanks,
-Sri.
Message from gerald@pfeifer.com
2012-04-08T13:18:02+00:00gerald_pfeifer.comurn:md5:1cef620546539eac2b43433111a72489
On Thu, 29 Mar 2012, Sriraman Tallam wrote:
> Hi,
>
> I have made a new patch to only have two builtins :
>
> * __builtin_cpu_is ("<CPUNAME>")
> * __builtin_cpu_supports ("<FEATURE>")
>
> apart from the cpu init builtin, __builtin_cpu_init.
I don't see any .texi file as part of this change. Shouldn't this
be documented (and also added to the release notes gcc-4.8/changes.html)?
> List of CPU names :
>
> * "amd"
> * "intel"
Are company names really suitable here? Intel is also still producing
ia64 aka Itanium, and in the future AMD might do some ARM-based designs.
AMD64 and Intel64 might work.
Gerald
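
Editorial sketch of how the two string-taking builtins named in the quoted message might be used for dispatch. It assumes an x86 compiler that provides __builtin_cpu_init, __builtin_cpu_is and __builtin_cpu_supports, and that "intel", "amd" and "sse2" are accepted strings; on any other target the code falls back to the generic path. The demo_ functions are hypothetical, and the "tuned" version deliberately reuses the generic loop so that the example stays portable.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*demo_sum_fn) (const int *, size_t);

/* Baseline version that runs anywhere.  */
static int
demo_sum_generic (const int *v, size_t n)
{
  int total = 0;
  size_t i;

  for (i = 0; i < n; i++)
    total += v[i];
  return total;
}

/* Stand-in for a version tuned for SSE2 hardware.  */
static int
demo_sum_sse2 (const int *v, size_t n)
{
  return demo_sum_generic (v, n);
}

/* Pick a version at run time.  */
static demo_sum_fn
demo_select_sum (void)
{
#if defined (__GNUC__) && (defined (__i386__) || defined (__x86_64__))
  /* Needed when this runs before constructors (for example in an
     IFUNC resolver); harmless otherwise.  */
  __builtin_cpu_init ();
  if ((__builtin_cpu_is ("intel") || __builtin_cpu_is ("amd"))
      && __builtin_cpu_supports ("sse2"))
    return demo_sum_sse2;
#endif
  return demo_sum_generic;
}

static int
demo_sum (const int *v, size_t n)
{
  assert (v != NULL || n == 0);
  return demo_select_sum () (v, n);
}
```

Because the argument is a string literal, the set of names can grow without adding a new builtin per CPU or feature, which is the point of the two-builtin design.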
Message from tmsriram@google.com
2012-04-09T17:18:42+00:00Sriramanurn:md5:6bb0bf706efc020c2b181661614e5dc2
On Sun, Apr 8, 2012 at 6:17 AM, Gerald Pfeifer <gerald@pfeifer.com> wrote:
> On Thu, 29 Mar 2012, Sriraman Tallam wrote:
>> Hi,
>>
>> I have made a new patch to only have two builtins :
>>
>> * __builtin_cpu_is ("<CPUNAME>")
>> * __builtin_cpu_supports ("<FEATURE>")
>>
>> apart from the cpu init builtin, __builtin_cpu_init.
>
> I don't see any .texi file as part of this change. Shouldn't this
> be documented (and also added to the release notes gcc-4.8/changes.html)?
I modified extend.texi for documenting the builtins. I guess you
missed it. I am reattaching the patch just in case.
>
>> List of CPU names :
>>
>> * "amd"
>> * "intel"
>
> Are company names really suitable here? Intel is also still producing
> ia64 aka Itanium, and in the future AMD might do some ARM-based designs.
> AMD64 and Intel64 might work.
This is basically the vendor info from CPUID. I don't mind changing it
but I do not understand why Intel or AMD is unsuitable.
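
Editorial sketch of the "vendor info from CPUID" referred to here: CPUID leaf 0 returns a 12-character vendor string spread over EBX, EDX and ECX, in that order. The helper below assembles the string from three register values passed in by the caller, so it never executes CPUID itself; the demo_ names are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Store the four bytes of REG, least significant first, at DST.  */
static void
demo_put_reg (char *dst, unsigned int reg)
{
  int i;

  for (i = 0; i < 4; i++)
    dst[i] = (char) ((reg >> (8 * i)) & 0xff);
}

/* Assemble the vendor string from CPUID leaf 0 register values, in
   the order EBX, EDX, ECX.  OUT must have room for 13 bytes.  */
static void
demo_vendor_string (unsigned int ebx, unsigned int edx, unsigned int ecx,
                    char *out)
{
  assert (out != NULL);
  demo_put_reg (out, ebx);
  demo_put_reg (out + 4, edx);
  demo_put_reg (out + 8, ecx);
  out[12] = '\0';
}

/* Vendor tests in the spirit of the "intel" and "amd" names.  */
static int
demo_is_intel (const char *vendor)
{
  return strcmp (vendor, "GenuineIntel") == 0;
}

static int
demo_is_amd (const char *vendor)
{
  return strcmp (vendor, "AuthenticAMD") == 0;
}
```

Under this reading, "intel" and "amd" name x86 CPUID vendor strings rather than everything the two companies produce.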
>
> Gerald
Message from tmsriram@google.com
2012-04-12T23:14:52+00:00Sriramanurn:md5:fdb84aab64b7ac8fb78794d84e8ce2b7
Ping.
Message from tmsriram@google.com
2012-04-18T23:08:34+00:00 Sriraman urn:md5:139c04ca0c8d72579aaea48069628995
Ping.
On Thu, Apr 12, 2012 at 4:14 PM, Sriraman Tallam <tmsriram@google.com> wrote:
> Ping.
>
> On Tue, Apr 3, 2012 at 12:47 PM, Sriraman Tallam <tmsriram@google.com> wrote:
>> Hi,
>>
>> i386 maintainers - Is this patch ok?
>>
>> Thanks,
>> -Sri.
>>
>> On Mon, Apr 2, 2012 at 5:48 PM, Sriraman Tallam <tmsriram@google.com> wrote:
>>> On Mon, Apr 2, 2012 at 5:38 AM, Richard Guenther
>>> <richard.guenther@gmail.com> wrote:
>>>> On Sat, Mar 31, 2012 at 1:03 AM, Sriraman Tallam <tmsriram@google.com> wrote:
>>>>> On Fri, Mar 30, 2012 at 5:47 AM, Michael Matz <matz@suse.de> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On Thu, 29 Mar 2012, Sriraman Tallam wrote:
>>>>>>
>>>>>>> +struct __processor_model
>>>>>>> +{
>>>>>>> + /* Vendor. */
>>>>>>> + unsigned int __cpu_is_amd : 1;
>>>>>>> + unsigned int __cpu_is_intel : 1;
>>>>>>> + /* CPU type. */
>>>>>>> + unsigned int __cpu_is_intel_atom : 1;
>>>>>>> + unsigned int __cpu_is_intel_core2 : 1;
>>>>>>> + unsigned int __cpu_is_intel_corei7 : 1;
>>>>>>> + unsigned int __cpu_is_intel_corei7_nehalem : 1;
>>>>>>> + unsigned int __cpu_is_intel_corei7_westmere : 1;
>>>>>>> + unsigned int __cpu_is_intel_corei7_sandybridge : 1;
>>>>>>> + unsigned int __cpu_is_amdfam10h : 1;
>>>>>>> + unsigned int __cpu_is_amdfam10h_barcelona : 1;
>>>>>>> + unsigned int __cpu_is_amdfam10h_shanghai : 1;
>>>>>>> + unsigned int __cpu_is_amdfam10h_istanbul : 1;
>>>>>>> + unsigned int __cpu_is_amdfam15h_bdver1 : 1;
>>>>>>> + unsigned int __cpu_is_amdfam15h_bdver2 : 1;
>>>>>>> +} __cpu_model;
>>>>>>
>>>>>> It doesn't make sense for the model to be a bitfield, a processor will
>>>>>> have only ever exactly one model. Just make it an enum or even just an
>>>>>> int.
>>>>>
>>>>> Not entirely true, nehalem and corei7 can be both set. However, I
>>>>> modified this by dividing it into types and sub types and then did
>>>>> what you said.
>>>>
>>>> Uh... then I suppose you need to document somewhere what names
>>>> match to what cpuid family/model (supposedly that's where your two-layer
>>>> hierarchy comes from, which incidentally misses one layer, the vendor?)
>>>
>>> Added documentation to extend.texi
>>>
>>> Patch available for review here:
>>> http://codereview.appspot.com/5754058
>>>
>>> Thanks,
>>> -Sri.
>>>
>>>
>>>>
>>>> Richard.
>>>>
>>>>> * config/i386/i386.c (build_processor_features_struct): New function.
>>>>> (build_processor_model_struct): New function.
>>>>> (make_var_decl): New function.
>>>>> (get_field_from_struct): New function.
>>>>> (fold_builtin_target): New function.
>>>>> (ix86_fold_builtin): New function.
>>>>> (ix86_expand_builtin): Expand new builtins by folding them.
>>>>> (make_cpu_type_builtin): New functions.
>>>>> (ix86_init_platform_type_builtins): Make the new builtins.
>>>>> (ix86_init_builtins): Make new builtins to detect CPU type.
>>>>> (TARGET_FOLD_BUILTIN): New macro.
>>>>> (IX86_BUILTIN_CPU_INIT): New enum value.
>>>>> (IX86_BUILTIN_CPU_IS): New enum value.
>>>>> (IX86_BUILTIN_CPU_SUPPORTS): New enum value.
>>>>> * config/i386/i386-builtin-types.def: New function type.
>>>>> * testsuite/gcc.target/builtin_target.c: New testcase.
>>>>>
>>>>> * libgcc/config/i386/i386-cpuinfo.c: New file.
>>>>> * libgcc/config/i386/t-cpuinfo: New file.
>>>>> * libgcc/config.host: Include t-cpuinfo.
>>>>> * libgcc/config/i386/libgcc-glibc.ver: Version symbols __cpu_model
>>>>> and __cpu_features.
>>>>>
>>>>> Patch available for review here:
>>>>> http://codereview.appspot.com/5754058
>>>>>
>>>>> Thanks,
>>>>> -Sri.
>>>>>
>>>>>
>>>>>>
>>>>>>
>>>>>> Ciao,
>>>>>> Michael.
Message from ubizjak@gmail.com
2012-04-23T08:19:52+00:00 ubizjak_gmail.com urn:md5:b3f4e471ce4672d2ec320a99070f7b0a
On Tue, Apr 3, 2012 at 9:47 PM, Sriraman Tallam <tmsriram@google.com> wrote:
> i386 maintainers - Is this patch ok?
Has the community reached consensus on how this kind of
functionality should be implemented? I have followed the discussion a
bit, but IIRC, there was no clear decision. Without this decision, I
am not able to review the _implementation_ of the agreed functionality
for the x86 target.
(I apologize if I have missed the decision, please point me to the
discussion in this case.)
>>>> Patch available for review here:
>>>> http://codereview.appspot.com/5754058
Please attach the patch or inline it in the message itself for review.
Please see [1] for further instructions.
[1] http://gcc.gnu.org/contribute.html#patches
Uros.
Message from tmsriram@google.com
2012-04-23T16:59:25+00:00 Sriraman urn:md5:d4bc5dbae92ae4e3a949b1da8f8edd97
Hi,
On Mon, Apr 23, 2012 at 1:19 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
> On Tue, Apr 3, 2012 at 9:47 PM, Sriraman Tallam <tmsriram@google.com> wrote:
>
>> i386 maintainers - Is this patch ok?
>
> Has the community reached the consensus on how this kind of
> functionality has to be implemented? I have followed the discussion a
> bit, but IIRC, there was no clear decision. Without this decision, I
> am not able to review the _implementation_ of agreed functionality for
> x86 target.
>
> (I apologize if I have missed the decision, please point me to the
> discussion in this case.)
The discussions are here:
http://gcc.gnu.org/ml/gcc-patches/2011-08/msg01446.html
and follow-ups to this.
I am not sure about consensus, but the important points raised were:
1) Constructor ordering: What if some constructors fire before
cpu_indicator_init, which determines the CPU? I addressed this
problem by giving cpu_indicator_init the highest possible priority.
Still, IFUNC initializers will fire before it, and they have to
explicitly call __builtin_cpu_init() before checking the CPU type.
2) Reducing the number of builtins: There are only two now.
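The IFUNC caveat in point 1 can be sketched as follows. This is only an illustration and not code from the patch: sum, sum_generic, sum_sse42 and resolve_sum are hypothetical names, and the SSE4.2 variant is a stand-in that computes the same result. The point is that the resolver calls __builtin_cpu_init () itself, because it may run before the CPU detection constructor does.

```c
#include <stdlib.h>  /* Only needed so that __GLIBC__ is defined on glibc.  */

typedef int (*sum_fn) (const int *, int);

/* Hypothetical baseline implementation.  */
static int
sum_generic (const int *v, int n)
{
  int s = 0;
  for (int i = 0; i < n; i++)
    s += v[i];
  return s;
}

/* Stand-in for a version tuned for SSE4.2; it computes the same result.  */
static int
sum_sse42 (const int *v, int n)
{
  return sum_generic (v, n);
}

/* Resolver.  It can run before any constructor, so it initializes the
   CPU data explicitly before querying it.  */
static sum_fn
resolve_sum (void)
{
#if defined (__x86_64__) || defined (__i386__)
  __builtin_cpu_init ();
  if (__builtin_cpu_supports ("sse4.2"))
    return sum_sse42;
#endif
  return sum_generic;
}

#if defined (__GLIBC__)
/* Bind sum () to whatever resolve_sum () returns at load time.  */
int sum (const int *v, int n) __attribute__ ((ifunc ("resolve_sum")));
#endif
```

Code that runs after constructors, such as anything reached from main (), would not need the explicit __builtin_cpu_init () call.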
>
>>>>> Patch available for review here:
>>>>> http://codereview.appspot.com/5754058
>
> Please attach patches or inline it in the message itself for a review.
> Please see [1] for further instructions.
Patch attached, tested on x86_64 and all tests pass.
* config/i386/i386.c (build_processor_features_struct): New function.
(build_processor_model_struct): New function.
(make_var_decl): New function.
(get_field_from_struct): New function.
(fold_builtin_target): New function.
(ix86_fold_builtin): New function.
(ix86_expand_builtin): Expand new builtins by folding them.
(make_cpu_type_builtin): New functions.
(ix86_init_platform_type_builtins): Make the new builtins.
(ix86_init_builtins): Make new builtins to detect CPU type.
(TARGET_FOLD_BUILTIN): New macro.
(IX86_BUILTIN_CPU_INIT): New enum value.
(IX86_BUILTIN_CPU_IS): New enum value.
(IX86_BUILTIN_CPU_SUPPORTS): New enum value.
* config/i386/i386-builtin-types.def: New function type.
* testsuite/gcc.target/builtin_target.c: New testcase.
* doc/extend.texi: Document builtins.
* libgcc/config/i386/i386-cpuinfo.c: New file.
* libgcc/config/i386/t-cpuinfo: New file.
* libgcc/config.host: Include t-cpuinfo.
* libgcc/config/i386/libgcc-glibc.ver: Version symbols __cpu_model
and __cpu_features.
Thanks,
-Sri.
>
> [1] http://gcc.gnu.org/contribute.html#patches
>
> Uros.
Message from tmsriram@google.com
2012-04-23T17:04:36+00:00 Sriraman urn:md5:feee1f7a5399153a63f14ea52c6a4ae4
On Mon, Apr 23, 2012 at 1:19 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
> On Tue, Apr 3, 2012 at 9:47 PM, Sriraman Tallam <tmsriram@google.com> wrote:
>
>> i386 maintainers - Is this patch ok?
>
> Has the community reached the consensus on how this kind of
> functionality has to be implemented? I have followed the discussion a
> bit, but IIRC, there was no clear decision. Without this decision, I
> am not able to review the _implementation_ of agreed functionality for
> x86 target.
>
> (I apologize if I have missed the decision, please point me to the
> discussion in this case.)
Also, Richard is OK with using just 2 builtins: __builtin_cpu_is and
__builtin_cpu_supports
http://gcc.gnu.org/ml/gcc-patches/2012-03/msg00752.html
>
>>>>> Patch available for review here:
>>>>> http://codereview.appspot.com/5754058
>
> Please attach patches or inline it in the message itself for a review.
> Please see [1] for further instructions.
>
> [1] http://gcc.gnu.org/contribute.html#patches
>
> Uros.
Message from ubizjak@gmail.com
2012-04-23T19:16:53+00:00 ubizjak_gmail.com urn:md5:d7313ea92d58c88e7cf658a8f015866a
On Mon, Apr 23, 2012 at 6:59 PM, Sriraman Tallam <tmsriram@google.com> wrote:
>>> i386 maintainers - Is this patch ok?
>>
>> Has the community reached the consensus on how this kind of
>> functionality has to be implemented? I have followed the discussion a
>> bit, but IIRC, there was no clear decision. Without this decision, I
>> am not able to review the _implementation_ of agreed functionality for
>> x86 target.
>>
>> (I apologize if I have missed the decision, please point me to the
>> discussion in this case.)
>
> The discussions are here:
>
> http://gcc.gnu.org/ml/gcc-patches/2011-08/msg01446.html
> and follow-ups to this.
>
> I am not sure about consensus, but the important points raised were:
>
> 1) Constructor ordering: What if some constructors fire before
> cpu_indicator_init?, which determines the CPU. I addressed this
> problem by making the priority of cpu_indicator_init to be the highest
> possible. Still, IFUNC initializers will fire before and they have to
> explicitly call __builtin_cpu_init() before checking the CPU type.
> 2) Reducing the number of builtins : It is only two now.
> * config/i386/i386.c (build_processor_features_struct): New function.
> (build_processor_model_struct): New function.
> (make_var_decl): New function.
> (get_field_from_struct): New function.
> (fold_builtin_target): New function.
> (ix86_fold_builtin): New function.
> (ix86_expand_builtin): Expand new builtins by folding them.
> (make_cpu_type_builtin): New functions.
> (ix86_init_platform_type_builtins): Make the new builtins.
> (ix86_init_builtins): Make new builtins to detect CPU type.
> (TARGET_FOLD_BUILTIN): New macro.
> (IX86_BUILTIN_CPU_INIT): New enum value.
> (IX86_BUILTIN_CPU_IS): New enum value.
> (IX86_BUILTIN_CPU_SUPPORTS): New enum value.
> * config/i386/i386-builtin-types.def: New function type.
> * testsuite/gcc.target/builtin_target.c: New testcase.
> * doc/extend.texi: Document builtins.
>
> * libgcc/config/i386/i386-cpuinfo.c: New file.
> * libgcc/config/i386/t-cpuinfo: New file.
> * libgcc/config.host: Include t-cpuinfo.
> * libgcc/config/i386/libgcc-glibc.ver: Version symbols __cpu_model
> and __cpu_features.
The patch is OK.
I guess that AVX is left as an exercise for an x86 maintainer ;)
(I have also CC'd H.J. for his opinion.)
Thanks,
Uros.
Message from ubizjak@gmail.com
2012-04-23T19:30:54+00:00 ubizjak_gmail.com urn:md5:7f2320d5ee120b8e69ccc2d6b8308d4a
On Mon, Apr 23, 2012 at 6:59 PM, Sriraman Tallam <tmsriram@google.com> wrote:
> * config/i386/i386.c (build_processor_features_struct): New function.
> (build_processor_model_struct): New function.
> (make_var_decl): New function.
> (get_field_from_struct): New function.
> (fold_builtin_target): New function.
> (ix86_fold_builtin): New function.
> (ix86_expand_builtin): Expand new builtins by folding them.
> (make_cpu_type_builtin): New functions.
> (ix86_init_platform_type_builtins): Make the new builtins.
> (ix86_init_builtins): Make new builtins to detect CPU type.
> (TARGET_FOLD_BUILTIN): New macro.
> (IX86_BUILTIN_CPU_INIT): New enum value.
> (IX86_BUILTIN_CPU_IS): New enum value.
> (IX86_BUILTIN_CPU_SUPPORTS): New enum value.
> * config/i386/i386-builtin-types.def: New function type.
> * testsuite/gcc.target/builtin_target.c: New testcase.
> * doc/extend.texi: Document builtins.
>
> * libgcc/config/i386/i386-cpuinfo.c: New file.
> * libgcc/config/i386/t-cpuinfo: New file.
> * libgcc/config.host: Include t-cpuinfo.
> * libgcc/config/i386/libgcc-glibc.ver: Version symbols __cpu_model
> and __cpu_features.
A couple of typos in the documentation:
+@deftypefn {Built-in Function} void __builtin_cpu_init (void)
+This function runs the CPU detection code to check the type of CPU and the features
+supported. This builtin needs to be invoked along with the builtins to check CPU type
+and features, @code{__builtin_cpu_is} and @code{__builtin_cpu_supports}, only when used
+used in a function that will be executed before any constructors are called. The
Typo: "used used".
+@item bdver1
+AMD family 15h Bulldozer version 1.
+
+@item bdver2
+AMD family 15h Bulldozer version 1.
Ehm... (bdver2 should read "version 2".)
+@deftypefn {Built-in Function} int __builtin_cpu_supports (const char *@var{feature})
+This function returns @code{1}, if the runtime cpu supports @var{feature}
+ and returns @code{0} otherwise. The following features can be detected:
In the testcases, the return value of the builtin is checked for < 0. If
this signals an error, then please mention the negative return value and
its meaning.
+@deftypefn {Built-in Function} int __builtin_cpu_is (const char *@var{cpuname})
+This function returns @code{1}, if the runtime cpu is of type @var{cpuname}
+ and returns @code{0} otherwise. The following cpu names can be detected:
Also here.
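For illustration, callers can avoid depending on the exact return value by treating the result of either builtin as a truth value. The sketch below assumes the zero/non-zero convention in the documentation quoted above; cpu_vendor_label and cpu_has_sse2 are hypothetical helper names, not part of the patch.

```c
/* Returns a short vendor label for the running CPU.  */
static const char *
cpu_vendor_label (void)
{
#if defined (__x86_64__) || defined (__i386__)
  if (__builtin_cpu_is ("intel"))
    return "intel";
  if (__builtin_cpu_is ("amd"))
    return "amd";
#endif
  return "other";
}

/* Normalizes the builtin's result to exactly 0 or 1.  */
static int
cpu_has_sse2 (void)
{
#if defined (__x86_64__) || defined (__i386__)
  return __builtin_cpu_supports ("sse2") != 0;
#else
  return 0;
#endif
}
```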
I also think that this is a very useful feature and deserves an entry in
the gcc-4.8 changes at http://gcc.gnu.org/gcc-4.8/changes.html .
Uros.
Message from hjl.tools@gmail.com
2012-04-23T20:41:24+00:00 hjl.tools_gmail.com urn:md5:2111b01acce69497927a0398473cf37d
On Mon, Apr 23, 2012 at 9:59 AM, Sriraman Tallam <tmsriram@google.com> wrote:
> Hi,
>
> On Mon, Apr 23, 2012 at 1:19 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
>> On Tue, Apr 3, 2012 at 9:47 PM, Sriraman Tallam <tmsriram@google.com> wrote:
>>
>>> i386 maintainers - Is this patch ok?
>>
>> Has the community reached the consensus on how this kind of
>> functionality has to be implemented? I have followed the discussion a
>> bit, but IIRC, there was no clear decision. Without this decision, I
>> am not able to review the _implementation_ of agreed functionality for
>> x86 target.
>>
>> (I apologize if I have missed the decision, please point me to the
>> discussion in this case.)
>
> The discussions are here:
>
> http://gcc.gnu.org/ml/gcc-patches/2011-08/msg01446.html
> and follow-ups to this.
>
> I am not sure about consensus, but the important points raised were:
>
> 1) Constructor ordering: What if some constructors fire before
> cpu_indicator_init?, which determines the CPU. I addressed this
> problem by making the priority of cpu_indicator_init to be the highest
> possible. Still, IFUNC initializers will fire before and they have to
> explicitly call __builtin_cpu_init() before checking the CPU type.
> 2) Reducing the number of builtins : It is only two now.
>
>
>>
>>>>>> Patch available for review here:
>>>>>> http://codereview.appspot.com/5754058
>>
>> Please attach patches or inline it in the message itself for a review.
>> Please see [1] for further instructions.
>
> Patch attached, tested on x86_64 and all tests pass.
>
>
> * config/i386/i386.c (build_processor_features_struct): New function.
> (build_processor_model_struct): New function.
> (make_var_decl): New function.
> (get_field_from_struct): New function.
> (fold_builtin_target): New function.
> (ix86_fold_builtin): New function.
> (ix86_expand_builtin): Expand new builtins by folding them.
> (make_cpu_type_builtin): New functions.
> (ix86_init_platform_type_builtins): Make the new builtins.
> (ix86_init_builtins): Make new builtins to detect CPU type.
> (TARGET_FOLD_BUILTIN): New macro.
> (IX86_BUILTIN_CPU_INIT): New enum value.
> (IX86_BUILTIN_CPU_IS): New enum value.
> (IX86_BUILTIN_CPU_SUPPORTS): New enum value.
> * config/i386/i386-builtin-types.def: New function type.
> * testsuite/gcc.target/builtin_target.c: New testcase.
> * doc/extend.texi: Document builtins.
>
> * libgcc/config/i386/i386-cpuinfo.c: New file.
> * libgcc/config/i386/t-cpuinfo: New file.
> * libgcc/config.host: Include t-cpuinfo.
> * libgcc/config/i386/libgcc-glibc.ver: Version symbols __cpu_model
> and __cpu_features.
>
> Thanks,
> -Sri.
>
>
I have 2 comments:
1. You should remove
     static int called = 0;
     if (called)
       return 0;
     else
       called = 1;
   Instead, you can just do
     if (_cpu_model.__cpu_vendor)
       return 0;
2. You can replace
     if (vendor == SIG_AMD)
   with
     else if (vendor == SIG_AMD)
Thanks.
--
H.J.
Message from tmsriram@google.com
2012-04-23T20:43:21+00:00 Sriraman urn:md5:62b3f83edd014caf3e23c5975037b727
On Mon, Apr 23, 2012 at 12:16 PM, Uros Bizjak <ubizjak@gmail.com> wrote:
> On Mon, Apr 23, 2012 at 6:59 PM, Sriraman Tallam <tmsriram@google.com> wrote:
>
>>>> i386 maintainers - Is this patch ok?
>>>
>>> Has the community reached the consensus on how this kind of
>>> functionality has to be implemented? I have followed the discussion a
>>> bit, but IIRC, there was no clear decision. Without this decision, I
>>> am not able to review the _implementation_ of agreed functionality for
>>> x86 target.
>>>
>>> (I apologize if I have missed the decision, please point me to the
>>> discussion in this case.)
>>
>> The discussions are here:
>>
>> http://gcc.gnu.org/ml/gcc-patches/2011-08/msg01446.html
>> and follow-ups to this.
>>
>> I am not sure about consensus, but the important points raised were:
>>
>> 1) Constructor ordering: What if some constructors fire before
>> cpu_indicator_init?, which determines the CPU. I addressed this
>> problem by making the priority of cpu_indicator_init to be the highest
>> possible. Still, IFUNC initializers will fire before and they have to
>> explicitly call __builtin_cpu_init() before checking the CPU type.
>> 2) Reducing the number of builtins : It is only two now.
>
>> * config/i386/i386.c (build_processor_features_struct): New function.
>> (build_processor_model_struct): New function.
>> (make_var_decl): New function.
>> (get_field_from_struct): New function.
>> (fold_builtin_target): New function.
>> (ix86_fold_builtin): New function.
>> (ix86_expand_builtin): Expand new builtins by folding them.
>> (make_cpu_type_builtin): New functions.
>> (ix86_init_platform_type_builtins): Make the new builtins.
>> (ix86_init_builtins): Make new builtins to detect CPU type.
>> (TARGET_FOLD_BUILTIN): New macro.
>> (IX86_BUILTIN_CPU_INIT): New enum value.
>> (IX86_BUILTIN_CPU_IS): New enum value.
>> (IX86_BUILTIN_CPU_SUPPORTS): New enum value.
>> * config/i386/i386-builtin-types.def: New function type.
>> * testsuite/gcc.target/builtin_target.c: New testcase.
>> * doc/extend.texi: Document builtins.
>>
>> * libgcc/config/i386/i386-cpuinfo.c: New file.
>> * libgcc/config/i386/t-cpuinfo: New file.
>> * libgcc/config.host: Include t-cpuinfo.
>> * libgcc/config/i386/libgcc-glibc.ver: Version symbols __cpu_model
>> and __cpu_features.
>
> The patch is OK.
>
> I guess that AVX is left as an exercise for a x86 maintainer ;)
I will add AVX, and make all the changes.
Thanks,
-Sri.
>
> (I have also CC'd H.J. for his opinion.)
>
> Thanks,
> Uros.
Message from hjl.tools@gmail.com
2012-04-24T00:46:07+00:00 hjl.tools_gmail.com urn:md5:15caf0c7764545d472fd146153ed46fd
On Mon, Apr 23, 2012 at 1:43 PM, Sriraman Tallam <tmsriram@google.com> wrote:
> On Mon, Apr 23, 2012 at 12:16 PM, Uros Bizjak <ubizjak@gmail.com> wrote:
>> On Mon, Apr 23, 2012 at 6:59 PM, Sriraman Tallam <tmsriram@google.com> wrote:
>>
>>>>> i386 maintainers - Is this patch ok?
>>>>
>>>> Has the community reached the consensus on how this kind of
>>>> functionality has to be implemented? I have followed the discussion a
>>>> bit, but IIRC, there was no clear decision. Without this decision, I
>>>> am not able to review the _implementation_ of agreed functionality for
>>>> x86 target.
>>>>
>>>> (I apologize if I have missed the decision, please point me to the
>>>> discussion in this case.)
>>>
>>> The discussions are here:
>>>
>>> http://gcc.gnu.org/ml/gcc-patches/2011-08/msg01446.html
>>> and follow-ups to this.
>>>
>>> I am not sure about consensus, but the important points raised were:
>>>
>>> 1) Constructor ordering: What if some constructors fire before
>>> cpu_indicator_init?, which determines the CPU. I addressed this
>>> problem by making the priority of cpu_indicator_init to be the highest
>>> possible. Still, IFUNC initializers will fire before and they have to
>>> explicitly call __builtin_cpu_init() before checking the CPU type.
>>> 2) Reducing the number of builtins : It is only two now.
>>
>>> * config/i386/i386.c (build_processor_features_struct): New function.
>>> (build_processor_model_struct): New function.
>>> (make_var_decl): New function.
>>> (get_field_from_struct): New function.
>>> (fold_builtin_target): New function.
>>> (ix86_fold_builtin): New function.
>>> (ix86_expand_builtin): Expand new builtins by folding them.
>>> (make_cpu_type_builtin): New functions.
>>> (ix86_init_platform_type_builtins): Make the new builtins.
>>> (ix86_init_builtins): Make new builtins to detect CPU type.
>>> (TARGET_FOLD_BUILTIN): New macro.
>>> (IX86_BUILTIN_CPU_INIT): New enum value.
>>> (IX86_BUILTIN_CPU_IS): New enum value.
>>> (IX86_BUILTIN_CPU_SUPPORTS): New enum value.
>>> * config/i386/i386-builtin-types.def: New function type.
>>> * testsuite/gcc.target/builtin_target.c: New testcase.
>>> * doc/extend.texi: Document builtins.
>>>
>>> * libgcc/config/i386/i386-cpuinfo.c: New file.
>>> * libgcc/config/i386/t-cpuinfo: New file.
>>> * libgcc/config.host: Include t-cpuinfo.
>>> * libgcc/config/i386/libgcc-glibc.ver: Version symbols __cpu_model
>>> and __cpu_features.
>>
>> The patch is OK.
>>
>> I guess that AVX is left as an exercise for a x86 maintainer ;)
>
> I will add AVX, and make all the changes.
>
3 more comments:
1. Your implementation isn't thread-safe. With my suggestion of
     if (_cpu_model.__cpu_vendor)
       return 0;
   you should change get_intel_cpu to return processor_vendor and
   set _cpu_model.__cpu_vendor after setting up all other bits.
2. Bitfields
   struct __processor_features
   +{
   +  unsigned int __cpu_cmov : 1;
   +  unsigned int __cpu_mmx : 1;
   +  unsigned int __cpu_popcnt : 1;
   +  unsigned int __cpu_sse : 1;
   +  unsigned int __cpu_sse2 : 1;
   +  unsigned int __cpu_sse3 : 1;
   +  unsigned int __cpu_ssse3 : 1;
   +  unsigned int __cpu_sse4_1 : 1;
   +  unsigned int __cpu_sse4_2 : 1;
   +} __cpu_features;
   may not be thread-safe, since only char/short/int/long long
   stores are atomic, and the struct isn't extensible. You can use
     unsigned int __cpu_features[1];
   to make it extensible and its stores atomic.
3. You may move
     unsigned int __cpu_features[1];
   into struct __processor_model to only export one symbol
   instead of 2.
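The three points above can be sketched together as follows. This is a minimal illustration under stated assumptions, not the layout the patch ends up with: every sk_ name is hypothetical, and the detected values are passed in as parameters so the store ordering is visible without any CPUID code. Note that plain C stores give no formal ordering guarantee; the scheme relies on aligned word stores being atomic on x86, as described above, and a strictly portable version would publish the vendor field with an atomic release store.

```c
/* Hypothetical names; zero means "not initialized yet".  */
enum sk_vendor
{
  SK_VENDOR_UNSET = 0,
  SK_VENDOR_INTEL,
  SK_VENDOR_AMD,
  SK_VENDOR_OTHER
};

enum sk_feature
{
  SK_FEATURE_CMOV = 0,
  SK_FEATURE_MMX,
  SK_FEATURE_POPCNT,
  SK_FEATURE_SSE,
  SK_FEATURE_SSE2
};

/* One exported object instead of two.  */
struct sk_cpu_model
{
  unsigned int vendor;       /* Written last; doubles as the "done" flag.  */
  unsigned int type;
  unsigned int subtype;
  unsigned int features[1];  /* One bit per feature; extend by adding words.  */
};

static struct sk_cpu_model sk_model;

/* Stores the detected values.  Runs its body only once.  */
static int
sk_cpu_init (unsigned int vendor, unsigned int type,
             unsigned int subtype, unsigned int features)
{
  if (sk_model.vendor)
    return 0;

  sk_model.type = type;
  sk_model.subtype = subtype;
  sk_model.features[0] = features;  /* All feature bits in one word store.  */
  sk_model.vendor = vendor;         /* Set after all other fields.  */
  return 0;
}

/* Returns 1 if feature F was recorded, 0 otherwise.  */
static int
sk_cpu_supports (enum sk_feature f)
{
  return (sk_model.features[0] >> f) & 1u;
}
```

Two threads racing through sk_cpu_init would store the same values, so a reader that sees a non-zero vendor is meant to see complete data.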
--
H.J.
Message from tmsriram@google.com
2012-04-25T00:10:04+00:00 Sriraman urn:md5:5ddc8edc3258c585dffc347f1d69ac59
Hi,
Thanks for all the comments. I have made all the changes as
mentioned and submitted the patch. Summary of changes made:
* Add support for AVX
* Fix documentation in extend.texi
* Make it thread-safe according to H.J.'s comments.
I have attached the patch. Bootstrapped and checked for test parity
with a pristine build.
* config/i386/i386.c (build_processor_model_struct): New function.
(make_var_decl): New function.
(fold_builtin_cpu): New function.
(ix86_fold_builtin): New function.
(make_cpu_type_builtin): New function.
(ix86_init_platform_type_builtins): New function.
(ix86_expand_builtin): Expand new builtins by folding them.
(ix86_init_builtins): Make new builtins to detect CPU type.
(TARGET_FOLD_BUILTIN): New macro.
(IX86_BUILTIN_CPU_INIT): New enum value.
(IX86_BUILTIN_CPU_IS): New enum value.
(IX86_BUILTIN_CPU_SUPPORTS): New enum value.
* config/i386/i386-builtin-types.def: New function type.
* testsuite/gcc.target/builtin_target.c: New testcase.
* doc/extend.texi: Document builtins.
* libgcc/config/i386/i386-cpuinfo.c: New file.
* libgcc/config/i386/t-cpuinfo: New file.
* libgcc/config.host: Include t-cpuinfo.
* libgcc/config/i386/libgcc-glibc.ver: Version symbol __cpu_model.
Thanks,
-Sri.
On Mon, Apr 23, 2012 at 5:46 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Mon, Apr 23, 2012 at 1:43 PM, Sriraman Tallam <tmsriram@google.com> wrote:
>> On Mon, Apr 23, 2012 at 12:16 PM, Uros Bizjak <ubizjak@gmail.com> wrote:
>>> On Mon, Apr 23, 2012 at 6:59 PM, Sriraman Tallam <tmsriram@google.com> wrote:
>>>
>>>>>> i386 maintainers - Is this patch ok?
>>>>>
>>>>> Has the community reached the consensus on how this kind of
>>>>> functionality has to be implemented? I have followed the discussion a
>>>>> bit, but IIRC, there was no clear decision. Without this decision, I
>>>>> am not able to review the _implementation_ of agreed functionality for
>>>>> x86 target.
>>>>>
>>>>> (I apologize if I have missed the decision, please point me to the
>>>>> discussion in this case.)
>>>>
>>>> The discussions are here:
>>>>
>>>> http://gcc.gnu.org/ml/gcc-patches/2011-08/msg01446.html
>>>> and follow-ups to this.
>>>>
>>>> I am not sure about consensus, but the important points raised were:
>>>>
>>>> 1) Constructor ordering: What if some constructors fire before
>>>> cpu_indicator_init?, which determines the CPU. I addressed this
>>>> problem by making the priority of cpu_indicator_init to be the highest
>>>> possible. Still, IFUNC initializers will fire before and they have to
>>>> explicitly call __builtin_cpu_init() before checking the CPU type.
>>>> 2) Reducing the number of builtins : It is only two now.
>>>
>>>> * config/i386/i386.c (build_processor_features_struct): New function.
>>>> (build_processor_model_struct): New function.
>>>> (make_var_decl): New function.
>>>> (get_field_from_struct): New function.
>>>> (fold_builtin_target): New function.
>>>> (ix86_fold_builtin): New function.
>>>> (ix86_expand_builtin): Expand new builtins by folding them.
>>>> (make_cpu_type_builtin): New functions.
>>>> (ix86_init_platform_type_builtins): Make the new builtins.
>>>> (ix86_init_builtins): Make new builtins to detect CPU type.
>>>> (TARGET_FOLD_BUILTIN): New macro.
>>>> (IX86_BUILTIN_CPU_INIT): New enum value.
>>>> (IX86_BUILTIN_CPU_IS): New enum value.
>>>> (IX86_BUILTIN_CPU_SUPPORTS): New enum value.
>>>> * config/i386/i386-builtin-types.def: New function type.
>>>> * testsuite/gcc.target/builtin_target.c: New testcase.
>>>> * doc/extend.texi: Document builtins.
>>>>
>>>> * libgcc/config/i386/i386-cpuinfo.c: New file.
>>>> * libgcc/config/i386/t-cpuinfo: New file.
>>>> * libgcc/config.host: Include t-cpuinfo.
>>>> * libgcc/config/i386/libgcc-glibc.ver: Version symbols __cpu_model
>>>> and __cpu_features.
>>>
>>> The patch is OK.
>>>
>>> I guess that AVX is left as an exercise for a x86 maintainer ;)
>>
>> I will add AVX, and make all the changes.
>>
>
> 3 more comments:
>
> 1. Your implementation isn't thread-safe. With my suggestion of
>
> if (_cpu_model.__cpu_vendor)
> return 0;
>
> you should change get_intel_cpu to return processor_vendor and
> set _cpu_model.__cpu_vendor after setting up all other bits.
>
> 2. Bitfields
>
> struct __processor_features
> +{
> + unsigned int __cpu_cmov : 1;
> + unsigned int __cpu_mmx : 1;
> + unsigned int __cpu_popcnt : 1;
> + unsigned int __cpu_sse : 1;
> + unsigned int __cpu_sse2 : 1;
> + unsigned int __cpu_sse3 : 1;
> + unsigned int __cpu_ssse3 : 1;
> + unsigned int __cpu_sse4_1 : 1;
> + unsigned int __cpu_sse4_2 : 1;
> +} __cpu_features;
>
> may not be thread-safe since only char/short/int/long long
> stores are atomic and it isn't extensible. You can use
>
> unsigned int __cpu_features[1];
>
> to make it extensible and store atomic.
>
> 3. You may move
>
> unsigned int __cpu_features[1];
>
> into struct __processor_model to only export one symbol
> instead of 2.
>
> --
> H.J.
Message from hjl.tools@gmail.com
2012-04-25T00:24:08+00:00 hjl.tools_gmail.com urn:md5:526226211f8f3278d8cabb724d6dde0f
On Tue, Apr 24, 2012 at 5:10 PM, Sriraman Tallam <tmsriram@google.com> wrote:
> Hi,
>
> Thanks for all the comments. I have made all the changes as
> mentioned and submitted the patch. Summary of changes made:
>
> * Add support for AVX
> * Fix documentation in extend.texi
> * Make it thread-safe according to H.J.'s comments.
>
> I have attached the patch. Boot-strapped and checked for test parity
> with pristine build.
>
> * config/i386/i386.c (build_processor_model_struct): New function.
> (make_var_decl): New function.
> (fold_builtin_cpu): New function.
> (ix86_fold_builtin): New function.
> (make_cpu_type_builtin): New function.
> (ix86_init_platform_type_builtins): New function.
> (ix86_expand_builtin): Expand new builtins by folding them.
> (ix86_init_builtins): Make new builtins to detect CPU type.
> (TARGET_FOLD_BUILTIN): New macro.
> (IX86_BUILTIN_CPU_INIT): New enum value.
> (IX86_BUILTIN_CPU_IS): New enum value.
> (IX86_BUILTIN_CPU_SUPPORTS): New enum value.
> * config/i386/i386-builtin-types.def: New function type.
> * testsuite/gcc.target/builtin_target.c: New testcase.
> * doc/extend.texi: Document builtins.
>
> * libgcc/config/i386/i386-cpuinfo.c: New file.
> * libgcc/config/i386/t-cpuinfo: New file.
> * libgcc/config.host: Include t-cpuinfo.
> * libgcc/config/i386/libgcc-glibc.ver: Version symbol __cpu_model.
>
>
+  /* This function needs to run just once. */
+  if (__cpu_model.__cpu_vendor)
+    return 0;
+
+  /* Assume cpuid insn present. Run in level 0 to get vendor id. */
+  if (!__get_cpuid_output (0, &eax, &ebx, &ecx, &edx))
+    return -1;
If __get_cpuid_output fails here, the function returns -1 without setting
__cpu_model.__cpu_vendor, so the detection code will run again on every
call. I think you should set __cpu_model.__cpu_vendor to non-zero in this
case.
Otherwise, it looks good to me.
Thanks.
--
H.J.
Message from tmsriram@google.com
2012-04-25T02:06:45+00:00 Sriraman urn:md5:c3a5182e0f892d2657d403ee6a413402
On Tue, Apr 24, 2012 at 5:24 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Tue, Apr 24, 2012 at 5:10 PM, Sriraman Tallam <tmsriram@google.com> wrote:
>> Hi,
>>
>> Thanks for all the comments. I have made all the changes as
>> mentioned and submitted the patch. Summary of changes made:
>>
>> * Add support for AVX
>> * Fix documentation in extend.texi
>> * Make it thread-safe according to H.J.'s comments.
>>
>> I have attached the patch. Boot-strapped and checked for test parity
>> with pristine build.
>>
>> * config/i386/i386.c (build_processor_model_struct): New function.
>> (make_var_decl): New function.
>> (fold_builtin_cpu): New function.
>> (ix86_fold_builtin): New function.
>> (make_cpu_type_builtin): New function.
>> (ix86_init_platform_type_builtins): New function.
>> (ix86_expand_builtin): Expand new builtins by folding them.
>> (ix86_init_builtins): Make new builtins to detect CPU type.
>> (TARGET_FOLD_BUILTIN): New macro.
>> (IX86_BUILTIN_CPU_INIT): New enum value.
>> (IX86_BUILTIN_CPU_IS): New enum value.
>> (IX86_BUILTIN_CPU_SUPPORTS): New enum value.
>> * config/i386/i386-builtin-types.def: New function type.
>> * testsuite/gcc.target/builtin_target.c: New testcase.
>> * doc/extend.texi: Document builtins.
>>
>> * libgcc/config/i386/i386-cpuinfo.c: New file.
>> * libgcc/config/i386/t-cpuinfo: New file.
>> * libgcc/config.host: Include t-cpuinfo.
>> * libgcc/config/i386/libgcc-glibc.ver: Version symbol __cpu_model.
>>
>>
>
> + /* This function needs to run just once. */
> + if (__cpu_model.__cpu_vendor)
> + return 0;
> +
> + /* Assume cpuid insn present. Run in level 0 to get vendor id. */
> + if (!__get_cpuid_output (0, &eax, &ebx, &ecx, &edx))
> + return -1;
>
> If __get_cpuid_output fails (returns zero), the function returns -1
> without setting __cpu_model.__cpu_vendor, so the detection will be
> run again on every call. I think you should set __cpu_model.__cpu_vendor
> to non-zero in this case.
Done now.
2012-04-24 Sriraman Tallam <tmsriram@google.com>
* libgcc/config/i386/i386-cpuinfo.c: Set __cpu_vendor always.
Index: libgcc/config/i386/i386-cpuinfo.c
===================================================================
--- libgcc/config/i386/i386-cpuinfo.c (revision 186789)
+++ libgcc/config/i386/i386-cpuinfo.c (working copy)
@@ -256,16 +256,25 @@ __cpu_indicator_init (void)
/* Assume cpuid insn present. Run in level 0 to get vendor id. */
if (!__get_cpuid_output (0, &eax, &ebx, &ecx, &edx))
- return -1;
+ {
+ __cpu_model.__cpu_vendor = VENDOR_OTHER;
+ return -1;
+ }
vendor = ebx;
max_level = eax;
if (max_level < 1)
- return -1;
+ {
+ __cpu_model.__cpu_vendor = VENDOR_OTHER;
+ return -1;
+ }
if (!__get_cpuid_output (1, &eax, &ebx, &ecx, &edx))
- return -1;
+ {
+ __cpu_model.__cpu_vendor = VENDOR_OTHER;
+ return -1;
+ }
model = (eax >> 4) & 0x0f;
family = (eax >> 8) & 0x0f;
Thanks,
-Sri.
>
> Otherwise, it looks good to me.
>
> Thanks.
>
>
> --
> H.J.
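The two context lines at the end of the diff extract the base model and family fields from EAX of CPUID leaf 1. A small sketch of that decoding follows; the function names are hypothetical, and only the two base fields shown in the diff are handled:

```c
/* Base model number: bits 7:4 of EAX from CPUID leaf 1. */
unsigned int
sketch_base_model (unsigned int eax)
{
  return (eax >> 4) & 0x0f;
}

/* Base family number: bits 11:8 of EAX from CPUID leaf 1. */
unsigned int
sketch_base_family (unsigned int eax)
{
  return (eax >> 8) & 0x0f;
}
```

For a sample EAX value of 0x000206a7, this gives base model 0xa and base family 6.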
Message from hjl.tools@gmail.com
2012-04-25T02:39:14+00:00hjl.tools_gmail.comurn:md5:00dff10d03cdbcf82613fcbafdce3697
On Tue, Apr 24, 2012 at 7:06 PM, Sriraman Tallam <tmsriram@google.com> wrote:
>
> Done now.
>
> 2012-04-24 Sriraman Tallam <tmsriram@google.com>
>
> * libgcc/config/i386/i386-cpuinfo.c: Set __cpu_vendor always.
>
>
> Index: libgcc/config/i386/i386-cpuinfo.c
> ===================================================================
> --- libgcc/config/i386/i386-cpuinfo.c (revision 186789)
> +++ libgcc/config/i386/i386-cpuinfo.c (working copy)
> @@ -256,16 +256,25 @@ __cpu_indicator_init (void)
>
> /* Assume cpuid insn present. Run in level 0 to get vendor id. */
> if (!__get_cpuid_output (0, &eax, &ebx, &ecx, &edx))
> - return -1;
> + {
> + __cpu_model.__cpu_vendor = VENDOR_OTHER;
> + return -1;
> + }
>
> vendor = ebx;
> max_level = eax;
>
> if (max_level < 1)
> - return -1;
> + {
> + __cpu_model.__cpu_vendor = VENDOR_OTHER;
> + return -1;
> + }
>
> if (!__get_cpuid_output (1, &eax, &ebx, &ecx, &edx))
> - return -1;
> + {
> + __cpu_model.__cpu_vendor = VENDOR_OTHER;
> + return -1;
> + }
>
> model = (eax >> 4) & 0x0f;
> family = (eax >> 8) & 0x0f;
>
>
> Thanks,
Should you also handle AVX2?
--
H.J.
Message from tmsriram@google.com
2012-04-25T21:25:08+00:00Sriramanurn:md5:551070b3d8f0825195857022b8f9893b
On Tue, Apr 24, 2012 at 7:39 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>
> Should you also handle AVX2?
I cannot test it, so I thought I would wait until I get access to a
processor with AVX2.
Thanks,
-Sri.
>
>
> --
> H.J.
Message from hjl.tools@gmail.com
2012-04-25T21:28:37+00:00hjl.tools_gmail.comurn:md5:ac8b39d016f1f0b8ab3d90600bcbc346
On Wed, Apr 25, 2012 at 2:25 PM, Sriraman Tallam <tmsriram@google.com> wrote:
>>
>> Should you also handle AVX2?
>
> I cannot test it, so I thought I would wait until I get access to a
> processor with AVX2.
>
You can download an AVX2 emulator (SDE) from
http://software.intel.com/en-us/avx/
to test AVX2 binaries.
--
H.J.
Message from tmsriram@google.com
2012-04-25T21:45:15+00:00Sriramanurn:md5:537806c0926d51226e04dec81535ab89
On Wed, Apr 25, 2012 at 2:28 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>
> You can download an AVX2 emulator (SDE) from
>
> http://software.intel.com/en-us/avx/
>
> to test AVX2 binaries.
Ok thanks, I will prepare a patch.
-Sri.
>
> --
> H.J.
Message from tmsriram@google.com
2012-04-25T23:38:43+00:00Sriramanurn:md5:de7c983381f03068091fd8f07cb88ca3
Hi H.J,
Could you please review this patch for AVX2 check?
* config/i386/i386-cpuinfo.c (FEATURE_AVX2): New enum value.
(get_available_features): New argument. Check for AVX2.
(__cpu_indicator_init): Modify call to get_available_features.
* doc/extend.texi: Document avx2 support.
* testsuite/gcc.target/i386/builtin_target.c: Check avx2.
* config/i386/i386.c (fold_builtin_cpu): Add avx2.
Thanks,
-Sri.
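A hedged sketch of what an AVX2 check of this kind can look like. It assumes the new argument to get_available_features is the maximum supported CPUID level, which the changelog does not state; `sketch_has_avx2` and `SKETCH_BIT_AVX2` are hypothetical names. AVX2 is reported in EBX bit 5 of CPUID leaf 7 (subleaf 0), so the leaf has to exist before the bit can be trusted:

```c
/* CPUID.(EAX=7,ECX=0):EBX bit 5 reports AVX2. */
#define SKETCH_BIT_AVX2 (1u << 5)

/* max_level: highest basic CPUID leaf the processor supports
   (EAX from leaf 0).  leaf7_ebx: EBX from leaf 7, subleaf 0.
   Returns 1 if the AVX2 bit can be read and is set, else 0.  */
int
sketch_has_avx2 (unsigned int max_level, unsigned int leaf7_ebx)
{
  /* Leaf 7 is not available; its output must not be used.  */
  if (max_level < 7)
    return 0;

  return (leaf7_ebx & SKETCH_BIT_AVX2) != 0;
}
```

A complete check would also confirm that the operating system saves the YMM state before AVX2 code is run; that step is omitted here.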
On Wed, Apr 25, 2012 at 2:45 PM, Sriraman Tallam <tmsriram@google.com> wrote:
>
> Ok thanks, I will prepare a patch.
>
> -Sri.
>
>>
>> --
>> H.J.
Message from hjl.tools@gmail.com
2012-04-25T23:52:14+00:00hjl.tools_gmail.comurn:md5:07b032a0d83a23e18158defbf738c331
On Wed, Apr 25, 2012 at 4:38 PM, Sriraman Tallam <tmsriram@google.com> wrote:
> Hi H.J,
>
> Could you please review this patch for AVX2 check?
>
> * config/i386/i386-cpuinfo.c (FEATURE_AVX2): New enum value.
> (get_available_features): New argument. Check for AVX2.
> (__cpu_indicator_init): Modify call to get_available_features.
> * doc/extend.texi: Document avx2 support.
> * testsuite/gcc.target/i386/builtin_target.c: Check avx2.
> * config/i386/i386.c (fold_builtin_cpu): Add avx2.
>
It looks good to me.
Thanks.
--
H.J.
Message from tmsriram@google.com
2012-04-26T00:52:28+00:00Sriramanurn:md5:6f83a1b1d3172eaca9e64dca9430cd8d
Patch committed.
Thanks,
-Sri.
On Wed, Apr 25, 2012 at 4:52 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>
> It looks good to me.
>
> Thanks.
>
> --
> H.J.
Message from tmsriram@google.com
2012-06-05T22:00:29+00:00Sriramanurn:md5:218980310091e32d6cb109092119ae01
Hi H.J.,
I am attaching a patch to add __cpu_indicator_init to the list of
symbols to be versioned and exported in libgcc_s.so. It also updates
the builtin_target.c test to execute CPUID explicitly and check that
the features are identified correctly, as you suggested earlier.
Is the patch OK?
* config/i386/libgcc-bsd.ver: Version symbol __cpu_indicator_init.
* config/i386/libgcc-sol2.ver: Ditto.
* config/i386/libgcc-glibc.ver: Ditto.
* gcc.target/i386/builtin_target.c (vendor_signatures): New enum.
(check_intel_cpu_model): New function.
(check_amd_cpu_model): New function.
(check_features): New function.
(__get_cpuid_output): New function.
(check_detailed): New function.
(fn1): Rename to quick_check.
(main): Update to call quick_check and call check_detailed.
Thanks,
-Sri.
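A minimal sketch of such a cross-check, assuming GCC (or a compatible compiler) on x86 with `<cpuid.h>`. `sketch_check_sse2` is a hypothetical name and is not one of the functions in the testcase. It compares `__builtin_cpu_supports ("sse2")` with EDX bit 26 of CPUID leaf 1:

```c
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* Returns 1 if the builtin and the raw CPUID bit agree (or if there
   is nothing to compare, e.g. on a non-x86 target), 0 otherwise.  */
int
sketch_check_sse2 (void)
{
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
  unsigned int eax, ebx, ecx, edx;

  /* Make sure the CPU model data is populated before querying it.  */
  __builtin_cpu_init ();

  /* __get_cpuid returns 0 if the requested leaf is unsupported.  */
  if (!__get_cpuid (1, &eax, &ebx, &ecx, &edx))
    return 1;

  int raw = (edx & (1u << 26)) != 0;                 /* SSE2 bit */
  int builtin = __builtin_cpu_supports ("sse2") != 0;
  return raw == builtin;
#else
  return 1;
#endif
}
```

The same pattern extends to the other features by pairing each feature name with its CPUID register and bit.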
On Wed, Apr 25, 2012 at 5:52 PM, Sriraman Tallam <tmsriram@google.com> wrote:
> Patch committed.
>
> Thanks,
> -Sri.
>
Message from hjl.tools@gmail.com
2012-06-06T13:52:01+00:00hjl.tools_gmail.comurn:md5:adc5dae0531195f7c7a8a576d091fdd8
On Tue, Jun 5, 2012 at 3:00 PM, Sriraman Tallam <tmsriram@google.com> wrote:
> Hi H.J.,
>
> I am attaching a patch to add __cpu_indicator_init to the list of
> symbols to be versioned and exported in libgcc_s.so. Also, updating
> builtin_target.c test to explicitly do a CPUID and check if the
> features are identified correctly like you had suggested earlier.
>
> Patch ok?
>
>
> * config/i386/libgcc-bsd.ver: Version symbol __cpu_indicator_init.
> * config/i386/libgcc-sol2.ver: Ditto.
> * config/i386/libgcc-glibc.ver: Ditto.
>
>
> * gcc.target/i386/builtin_target.c (vendor_signatures): New enum.
> (check_intel_cpu_model): New function.
> (check_amd_cpu_model): New function.
> (check_features): New function.
> (__get_cpuid_output): New function.
> (check_detailed): New function.
> (fn1): Rename to quick_check.
> (main): Update to call quick_check and call check_detailed.
>
It looks good. The only problem is that for C programs, __cpu_model and
__cpu_indicator_init in libgcc_s.so aren't used at all. As I suggested in

http://gcc.gnu.org/ml/gcc-patches/2012-05/msg01816.html

we can do one of 3 things:

1. Abuse libgcc_eh.a by moving __cpu_model and __cpu_indicator_init
from libgcc.a to libgcc_eh.a.
2. Rename libgcc_eh.a to libgcc_static.a and move __cpu_model and
__cpu_indicator_init from libgcc.a to libgcc_static.a.
3. Add libgcc_static.a and move __cpu_model and __cpu_indicator_init
from libgcc.a to libgcc_static.a. We treat libgcc_static.a similar to
libgcc_eh.a.
--
H.J.
Message from tmsriram@google.com
2012-06-06T15:04:22+00:00Sriramanurn:md5:5513ddea90f6a9fd11390be79bf9251a
On Jun 6, 2012 6:52 AM, "H.J. Lu" <hjl.tools@gmail.com> wrote:
> It looks good. The only problem is for C programs, __cpu_model and
> __cpu_indicator_init in libgcc_s.so aren't used at all. I suggested
> in
>
> http://gcc.gnu.org/ml/gcc-patches/2012-05/msg01816.html
>
> We can do one
> of 3 things:
>
> 1. Abuse libgcc_eh.a by moving __cpu_model and __cpu_indicator_init
> from libgcc.a to libgcc_eh.a.
> 2. Rename libgcc_eh.a to libgcc_static.a and move __cpu_model and
> __cpu_indicator_init from libgcc.a to libgcc_static.a.
> 3. Add libgcc_static.a and move __cpu_model and __cpu_indicator_init
> from libgcc.a to libgcc_static.a. We treat libgcc_static.a similar to
> libgcc_eh.a.
>
Right, I should have mentioned that I am doing this in another patch.
Option 1 sounds easy, but option 3 sounds best.
Thanks
-Sri
>
> --
> H.J.
Message from tmsriram@google.com
2012-06-12T02:56:32+00:00Sriramanurn:md5:48947945b0208a45942ff762f0c9efa2
On Wed, Jun 6, 2012 at 6:52 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>
> It looks good.
I submitted this patch. I am working on fixing the problem mentioned
below, that the CPU indicator symbols in libgcc_s.so are not used in C
programs.
Thanks,
-Sri.
> The only problem is for C programs, __cpu_model and
> __cpu_indicator_init in libgcc_s.so aren't used at all. I suggested
> in
>
> http://gcc.gnu.org/ml/gcc-patches/2012-05/msg01816.html
>
> We can do one
> of 3 things:
>
> 1. Abuse libgcc_eh.a by moving __cpu_model and __cpu_indicator_init
> from libgcc.a to libgcc_eh.a.
> 2. Rename libgcc_eh.a to libgcc_static.a and move __cpu_model and
> __cpu_indicator_init from libgcc.a to libgcc_static.a.
> 3. Add libgcc_static.a and move __cpu_model and __cpu_indicator_init
> from libgcc.a to libgcc_static.a. We treat libgcc_static.a similar to
> libgcc_eh.a.
>
>
> --
> H.J.