Rietveld Code Review Tool
Help | Bug tracker | Discussion group | Source code | Sign in
(6436)

Issue 4078044: [pph] Remove code to handle lexing from arbitrary offsets (Closed)

Can't Edit
Can't Publish+Mail
Start Review
Created:
13 years, 3 months ago by Diego Novillo
Modified:
13 years, 2 months ago
Reviewers:
Lawrence Crowl
CC:
gcc-patches_gcc.gnu.org
Base URL:
svn+ssh://gcc.gnu.org/svn/gcc/branches/pph/
Visibility:
Public.

Description

This patch removes the code that used to handle mixing token hunks coming from text files with those coming from images. This simplifies some of the logic since the idea is to only tolerate header files that can be completely converted into images. In addition, it XFAILs 3 tests that are currently triggering this error. Lawrence, I believe these are due to the hunk verifier getting somehow confuesd by __need_size_t in /usr/include/sys/types.h. I've looked at the failure with -fpth-debug=1 and I don't see why it gets confused. It's expecting __need_size_t to *not* be defined, but it is defined by the file a few lines above the failure: [ ... /usr/include/sys/types.h ... ] ... #define __need_size_t #include <stddef.h> #ifdef __USE_MISC /* Old compatibility names for C types. */ typedef unsigned long int ulong; typedef unsigned short int ushort; typedef unsigned int uint; #endif ... The hunk starting with 'typedef unsigned long int ulong' fails with PTH: _usr_include_sys_types_h.pth failed verification: __need_size_t : <__need_size_t > -> <(null)> /home/dnovillo/pph/src/gcc/testsuite/g++.dg/pph/mean.cc:1:0: fatal error: Found an invalid hunk in /usr/include/sys/types.h. This header file cannot be converted into a pre-parsed image. To reproduce, do 'make check-g++ RUNTESTFLAGS=pph.exp' and search for '^XFAIL' in testsuite/g++/g++.log, that will give you the command line to reproduce the failure. No failures other than those. Tested on x86_64. Diego. libcpp/ChangeLog.pph 2011-01-27 Diego Novillo <dnovillo@google.com> * include/cpplib.h (cpp_reset_lexer_state, cpp_restore_lexer_state): Remove. * lex.c: Likewise. Update all users. gcc/cp/ChangeLog.pph 2011-01-27 Diego Novillo <dnovillo@google.com> * Make-lang.in (cp/parser.o): Fix dependencies. * pph.c (pth_process_text_file): Do not try to start reading the file at an arbitrary offset. Remove OFFSET argument. Update all users. (pth_image_to_lexer): Emit a fatal error if a hunk fails validation. (pth_lexer_to_image): (pth_enter_file): * pph.h (struct cp_token_hunk): Remove fields text_offset and text_length. Update all users. (struct pth_image): Remove field hunk_text_offset. Update all users. testsuite/ChangeLog.pph 2011-01-27 Diego Novillo <dnovillo@google.com> * g++.dg/pph/system-include.cc: xfail for invalid hunks. * g++.dg/pph/mean.cc: Likewise. * g++.dg/pph/sys-types.cc: Likewise. * g++.dg/pph/variables.cc: Fix error about initializing an extern.

Patch Set 1 #

Unified diffs Side-by-side diffs Delta from patch set Stats (+38 lines, -197 lines) Patch
M gcc/cp/Make-lang.in View 1 chunk +1 line, -3 lines 0 comments Download
M gcc/cp/pph.h View 2 chunks +0 lines, -15 lines 0 comments Download
M gcc/cp/pph.c View 15 chunks +30 lines, -149 lines 0 comments Download
M gcc/testsuite/g++.dg/pph/mean.cc View 2 chunks +2 lines, -1 line 0 comments Download
M gcc/testsuite/g++.dg/pph/sys-types.cc View 1 chunk +2 lines, -1 line 0 comments Download
M gcc/testsuite/g++.dg/pph/system-include.cc View 1 chunk +2 lines, -1 line 0 comments Download
M gcc/testsuite/g++.dg/pph/variables.cc View 1 chunk +1 line, -1 line 0 comments Download
M libcpp/include/cpplib.h View 1 chunk +0 lines, -2 lines 0 comments Download
M libcpp/lex.c View 1 chunk +0 lines, -24 lines 0 comments Download

Messages

Total messages: 2
Diego Novillo
13 years, 3 months ago (2011-01-27 23:30:26 UTC) #1
Diego Novillo
13 years, 3 months ago (2011-01-28 13:06:40 UTC) #2
Original patch:

> Index: gcc/testsuite/g++.dg/pph/system-include.cc
> ===================================================================
> --- gcc/testsuite/g++.dg/pph/system-include.cc	(revision 169332)
> +++ gcc/testsuite/g++.dg/pph/system-include.cc	(working copy)
> @@ -1,2 +1,3 @@
> -#include <stdlib.h>
> +#include <stdlib.h>	// { dg-error "fatal" "invalid hunk" { xfail *-*-* } }
>  size_t X;
> +// { dg-error "excess errors" "" { xfail *-*-* } }
> Index: gcc/testsuite/g++.dg/pph/variables.cc
> ===================================================================
> --- gcc/testsuite/g++.dg/pph/variables.cc	(revision 169332)
> +++ gcc/testsuite/g++.dg/pph/variables.cc	(working copy)
> @@ -1,4 +1,4 @@
> -extern int gbl_init_extern = 3;		// need body			pass
> +extern int gbl_init_extern;		// need body			pass
>  extern int gbl_uninit_extern;		// head only			pass
>  int gbl_tentative;			// need body			pass
>  int gbl_initial = 1;			// need body			pass
> Index: gcc/testsuite/g++.dg/pph/mean.cc
> ===================================================================
> --- gcc/testsuite/g++.dg/pph/mean.cc	(revision 169332)
> +++ gcc/testsuite/g++.dg/pph/mean.cc	(working copy)
> @@ -1,4 +1,4 @@
> -#include <stdlib.h>
> +#include <stdlib.h>	// { dg-error "fatal" "invalid hunk" { xfail *-*-* } }
>  #include <stdio.h>
>  #include <math.h>
>  #include <string.h>
> @@ -160,3 +160,4 @@ main (int argc, char *argv[])
>
>    return 0;
>  }
> +// { dg-error "excess errors" "" { xfail *-*-* } }
> Index: gcc/testsuite/g++.dg/pph/sys-types.cc
> ===================================================================
> --- gcc/testsuite/g++.dg/pph/sys-types.cc	(revision 169332)
> +++ gcc/testsuite/g++.dg/pph/sys-types.cc	(working copy)
> @@ -1 +1,2 @@
> -#include <sys/types.h>
> +#include <sys/types.h> 	// { dg-error "fatal" "invalid hunk" { xfail *-*-* }
}
> +// { dg-error "excess errors" "" { xfail *-*-* } }
> Index: gcc/cp/Make-lang.in
> ===================================================================
> --- gcc/cp/Make-lang.in	(revision 169332)
> +++ gcc/cp/Make-lang.in	(working copy)
> @@ -319,9 +319,7 @@ cp/mangle.o: cp/mangle.c $(CXX_TREE_H) $
>    gt-cp-mangle.h $(TARGET_H) $(TM_P_H) $(CGRAPH_H)
>  cp/parser.o: cp/parser.c $(CXX_TREE_H) $(TM_H) $(DIAGNOSTIC_CORE_H) \
>    gt-cp-parser.h output.h $(TARGET_H) $(PLUGIN_H) intl.h \
> -  $(TIMEVAR_H) c-family/c-objc.h \
> -  pointer-set.h fixed-value.h $(MD5_H) $(HASHTAB_H) tree-pass.h \
> -  $(TREE_INLINE_H) tree-pretty-print.h $(CXX_PARSER_H) \
> +  c-family/c-objc.h tree-pretty-print.h $(CXX_PARSER_H) \
>    $(CXX_PPH_H)
>  cp/cp-gimplify.o: cp/cp-gimplify.c $(CXX_TREE_H) $(C_COMMON_H) \
>  	$(TM_H) coretypes.h pointer-set.h tree-iterator.h
> Index: gcc/cp/pph.c
> ===================================================================
> --- gcc/cp/pph.c	(revision 169332)
> +++ gcc/cp/pph.c	(working copy)
> @@ -326,17 +326,6 @@ pth_write_uint (unsigned int value, FILE
>  }
>
>
> -/* Write a size_t VALUE to the STREAM.  Return the number of bytes written. 
*/
> -
> -static inline size_t
> -pth_write_sizet (size_t value, FILE *stream)
> -{
> -  size_t sent = fwrite (&value, 1, sizeof (value), stream);
> -  gcc_assert (sent == sizeof (value));
> -  return sent;
> -}
> -
> -
>  /* Write N bytes from P to STREAM.  */
>
>  static inline size_t
> @@ -564,9 +553,6 @@ pth_debug_identifiers (cpp_idents_used *
>  static void
>  pth_dump_hunk (FILE *stream, cp_token_hunk *hunk)
>  {
> -  fprintf (stream, "Hunk location: { %lu, %lu, %lu }\n",
hunk->text_offset.cur,
> -	   hunk->text_offset.line_base, hunk->text_offset.next_line);
> -  fprintf (stream, "Hunk length:   %lu characters\n", hunk->text_length);
>    pth_dump_identifiers (stream, &hunk->identifiers);
>    cp_lexer_dump_tokens (stream, hunk->buffer, 0);
>  }
> @@ -846,14 +832,6 @@ pth_save_hunk (cp_token_hunk *hunk, FILE
>    /* Write out the identifiers used by HUNK.  */
>    pth_save_identifiers (&hunk->identifiers, stream);
>
> -  /* Write the offset into the text file where this token starts.  */
> -  pth_write_sizet (hunk->text_offset.cur, stream);
> -  pth_write_sizet (hunk->text_offset.line_base, stream);
> -  pth_write_sizet (hunk->text_offset.next_line, stream);
> -
> -  /* Write the length of the hunk.  */
> -  pth_write_sizet (hunk->text_length, stream);
> -
>    /* Write the number of tokens in HUNK.  */
>    pth_write_uint (VEC_length (cp_token, hunk->buffer), stream);
>
> @@ -971,16 +949,6 @@ pth_read_uint (unsigned int *var_p, FILE
>  }
>
>
> -/* Read a size_t into *VAR_P.  */
> -
> -static void
> -pth_read_sizet (size_t *var_p, FILE *stream)
> -{
> -  size_t received = fread (var_p, sizeof *var_p, 1, stream);
> -  gcc_assert (received == 1);
> -}
> -
> -
>  /* Read N bytes into P from STREAM.  The caller is responsible
>     for allocating sufficient memory for P.  */
>
> @@ -1204,14 +1172,6 @@ pth_load_hunk (pth_image *image, FILE *s
>    /* Setup the identifier list.  */
>    pth_load_identifiers (&hunk->identifiers, stream);
>
> -  /* Read the offset into the text file where this token starts.  */
> -  pth_read_sizet (&hunk->text_offset.cur, stream);
> -  pth_read_sizet (&hunk->text_offset.line_base, stream);
> -  pth_read_sizet (&hunk->text_offset.next_line, stream);
> -
> -  /* Read the text length of the hunk.  */
> -  pth_read_sizet (&hunk->text_length, stream);
> -
>    /* Read the number of tokens in HUNK. */
>    pth_read_uint (&num_tokens, stream);
>
> @@ -1446,7 +1406,7 @@ pth_image_lookup (pth_state *state, cons
>     the memory image holding HUNK.  */
>
>  static void
> -pth_append_hunk (cp_lexer *lexer, pth_image *image, cp_token_hunk *hunk)
> +pth_append_hunk (cp_lexer *lexer, cp_token_hunk *hunk)
>  {
>    cp_token *lexer_addr, *hunk_addr;
>    unsigned lexer_len, hunk_len;
> @@ -1456,12 +1416,6 @@ pth_append_hunk (cp_lexer *lexer, pth_im
>    /* Apply all the identifiers used and defined by HUNK.  */
>    cpp_lt_replay (parse_in, &hunk->identifiers);
>
> -  /* If this file has been read in memory (e.g., by being #included
> -     from a tainted file), advance the text buffer pointer to the end
> -     of the hunk to avoid reading it again.  */
> -  if (image->buffer)
> -    cpp_set_pos (image->buffer, hunk->text_offset);
> -
>    hunk_len = VEC_length (cp_token, hunk->buffer);
>
>    /* Some hunks have no tokens and they are only useful for the
> @@ -1503,6 +1457,7 @@ pth_hunk_is_valid_p (pth_image *image, c
>    verified = cpp_lt_verify (reader, &hunk->identifiers, &bad_use, &cur_def);
>    if (!verified && flag_pth_debug >= 1)
>      {
> +      pth_debug_hunk (hunk);
>        fprintf (stderr, "PTH: %s failed verification: %s : <%s> -> <%s>\n",
>                           pth_name_for (image->fname), bad_use->ident_str,
>                           bad_use->before_str, cur_def);
> @@ -1551,89 +1506,46 @@ pth_get_dir_and_name (const char *name,
>     #included originally.  Otherwise, it is assumed to be an
>     include command done with the flag -include.
>
> -   This is used when an image is found to be tainted and tokens
> -   need to be read from the original character stream.  OFFSET
> -   indicates how far into the character stream to start reading at.  */
> +   This is used when we need to process an #include command from
> +   a PPH image and the file to be included is a regular text file.  */
>
>  static void
> -pth_process_text_file (cp_lexer *lexer, pth_image *image, pth_include
*include,
> -		       cpp_offset offset)
> +pth_process_text_file (cp_lexer *lexer, pth_image *image, pth_include
*include)
>  {
> -  bool prev_permissive, prev_inhibit_warnings;
>    bool pushed_p;
>    cpp_buffer *buffer;
> -  lexer_state *state;
>
>    /* Emulate a #include directive on IMAGE->FNAME, if needed.  Note
>       that if we are already inside the CPP buffer for IMAGE->FNAME
>       we should not include it again, since this will cause another
>       call to pth_file_change which will again register IMAGE->FNAME as
>       an include for the parent file.  */
> -  state = NULL;
> -  if (image->buffer != cpp_get_buffer (parse_in))
> -    {
> -      if (include == NULL || include->itype == IT_INCLUDE_NEXT)
> -	{
> -	  char *dname;
> -	  const char *fname;
> -	  pth_get_dir_and_name (image->fname, &dname, &fname);
> -	  pushed_p = cpp_push_include_type (parse_in, dname, fname, false,
> -					    IT_INCLUDE);
> -	
> -	  /* FIXME pph.  We are leaking DNAME here.  libcpp
> -	     wants the directory name in permanent storage so we
> -	     cannot free it, but we should put it in an obstack
> -	     so it can be reclaimed at some point.  */
> -	}
> -      else
> -	pushed_p = cpp_push_include_type (parse_in,
> -					  include->dname,
> -					  include->iname,
> -					  include->angle_brackets,
> -					  include->itype);
> +  gcc_assert (image->buffer != cpp_get_buffer (parse_in));
>
> -      if (!pushed_p)
> -	return;
> -    }
> -  else
> +  if (include == NULL || include->itype == IT_INCLUDE_NEXT)
>      {
> -      /* Since we already have the buffer on top of the stack,
> -	 reset the state of the lexer to avoid skipping over it.  */
> -      state = cpp_reset_lexer_state (parse_in);
> +      char *dname;
> +      const char *fname;
> +      pth_get_dir_and_name (image->fname, &dname, &fname);
> +      /* FIXME pph.  We are leaking DNAME here.  libcpp
> +	 wants the directory name in permanent storage so we
> +	 cannot free it, but we should put it in an obstack
> +	 so it can be reclaimed at some point.  */
> +      pushed_p = cpp_push_include_type (parse_in, dname, fname, false,
> +	                                IT_INCLUDE);
>      }
> +  else
> +    pushed_p = cpp_push_include_type (parse_in, include->dname,
include->iname,
> +				      include->angle_brackets, include->itype);
>
> -  /* Inhibit libcpp error messages (ignore things like unmatched #endifs).
> -     We need to do this because token hunks may span #if/#endif boundaries.
> -     For instance,
> -
> -     	#if <COND>
> -	  H1 [1]
> -	#else
> -	  H2 [2]
> -	  #include "foo.h"
> -	  H3 [3]
> -	#endif
> -	H1 or H3 [4]
> -
> -     In this example we have 3 hunks (H1, H2 and H3).  Depending on the
> -     value of <COND> at image creation time, point [4] will belong to
> -     hunk H1 (if <COND> was true) or hunk H3 (if <COND> was false).
> -
> -     When this image is loaded, if H3 cannot be applied because its
> -     dependences fail to hold, then we will start pre-processing the
> -     text for the file at the start of hunk H3.  This will find a
> -     dangling #endif which triggers a libcpp error.  Since we know
> -     that we are in the correct arm of the #if (after all H2 was
> -     applied from the image), we can safely ignore the error.  */
> -  prev_permissive = global_dc->permissive;
> -  prev_inhibit_warnings = global_dc->dc_inhibit_warnings;
> -  global_dc->dc_inhibit_warnings = true;
> -  global_dc->permissive = true;
> +  /* Nothing else to do if libcpp decided it did not need the file.  */
> +  if (!pushed_p)
> +    return;
>
> -  /* Position the reader at OFFSET and request to stop reading at
> -     the end of it.  */
> +  /* Position the reader at the start of the buffer and request to
> +     stop reading at the end of it.  */
>    buffer = cpp_get_buffer (parse_in);
> -  cpp_set_pos (buffer, offset);
> +  cpp_set_pos (buffer, cpp_buffer_start);
>    cpp_return_at_eof (buffer, true);
>
>    /* Get tokens from IMAGE->FNAME.  */
> @@ -1643,14 +1555,6 @@ pth_process_text_file (cp_lexer *lexer,
>       will now contain an EOF, which we do not need.  */
>    VEC_pop (cp_token, lexer->buffer);
>    cpp_return_at_eof (buffer, false);
> -
> -  /* Restore libcpp error/warnings.  */
> -  global_dc->permissive = prev_permissive;
> -  global_dc->dc_inhibit_warnings = prev_inhibit_warnings;
> -
> -  /* Restore the parser state, if necessary.  */
> -  if (state)
> -    cpp_restore_lexer_state (parse_in, state);
>  }
>
>
> @@ -1702,21 +1606,11 @@ pth_image_to_lexer (cp_lexer *lexer, pth
>
>  	  hunk = VEC_index (cp_token_hunk_ptr, image->token_hunks, h_ix++);
>  	  if (pth_hunk_is_valid_p (image, hunk, parse_in))
> -	    pth_append_hunk (lexer, image, hunk);
> +	    pth_append_hunk (lexer, hunk);
>  	  else
> -	    {
> -	      PTH_STATS_INCR (invalid_hunks, 1);
> -
> -	      pth_process_text_file (lexer, image, NULL, hunk->text_offset);
> -
> -	      /* Since this hunk is invalid, assume that everything
> -		 downstream from this hunk is also invalid.  FIXME pph,
> -		 it may be possible to optimize this.  We should be
> -		 able to pre-process from text exactly
> -		 HUNK->TEXT_LENGTH characters instead of the whole
> -		 file.  */
> -	      break;
> -	    }
> +	    fatal_error ("Found an invalid hunk in %s.  This header file "
> +			 "cannot be converted into a pre-parsed image.",
> +			 image->fname);
>  	}
>        else if (s == 'I')
>  	{
> @@ -1726,8 +1620,7 @@ pth_image_to_lexer (cp_lexer *lexer, pth
>  	  if (pth_image_can_be_used (incdir->image))
>  	    pth_image_to_lexer (lexer, incdir->image, incdir);
>  	  else
> -	    pth_process_text_file (lexer, incdir->image, incdir,
> -				   cpp_buffer_start);
> +	    pth_process_text_file (lexer, incdir->image, incdir);
>  	}
>        else
>  	gcc_unreachable ();
> @@ -1767,7 +1660,6 @@ pth_lexer_to_image (pth_image *image, cp
>    cp_token *lexer_addr, *hunk_addr;
>    cp_token_hunk *hunk;
>    unsigned num_tokens, start_ix, end_ix;
> -  cpp_offset pos;
>
>    /* Create a new token hunk.  */
>    hunk = ggc_alloc_cleared_cp_token_hunk ();
> @@ -1777,12 +1669,6 @@ pth_lexer_to_image (pth_image *image, cp
>    /* The identifiers that may conflict with macros.  */
>    hunk->identifiers = cpp_lt_capture (reader);
>
> -  /* Remember the text offset where this hunk started and its length.  */
> -  hunk->text_offset = image->hunk_text_offset;
> -  pos = cpp_get_pos (image->buffer);
> -  gcc_assert (pos.cur >= hunk->text_offset.cur);
> -  hunk->text_length = pos.cur - hunk->text_offset.cur;
> -
>    /* Compute the bounds for the new token hunk.  */
>    start_ix = image->hunk_start_ix;
>    end_ix = VEC_length (cp_token, lexer->buffer);
> @@ -1926,11 +1812,6 @@ pth_enter_file (cpp_reader *reader, pth_
>       file image will be at the current last slot in
>       STATE->LEXER->BUFFER.  */
>    image->hunk_start_ix = VEC_length (cp_token, state->lexer->buffer);
> -
> -  /* The new hunk starts at the current offset in the current
> -     libcpp buffer.  If this hunk is ever invalidated, this is
> -     the offset at which to start pre-processing.  */
> -  image->hunk_text_offset = cpp_get_pos (image->buffer);
>  }
>
>
> Index: gcc/cp/pph.h
> ===================================================================
> --- gcc/cp/pph.h	(revision 169332)
> +++ gcc/cp/pph.h	(working copy)
> @@ -35,17 +35,6 @@ typedef struct GTY(()) cp_token_hunk
>
>    /* The array of tokens.  */
>    VEC(cp_token,gc) *buffer;
> -
> -  /* Offset into the libcpp file buffer where this token hunk starts at.
> -     If this token hunk is invalidated, the lexer is repositioned at
> -     this offset into the file to start lexing again.  */
> -  cpp_offset text_offset;
> -
> -  /* Length of this hunk in characters.  When the tokens in this hunk
> -     are moved into the lexer buffer, the original character stream
> -     is advanced this many characters to avoid future lexing from
> -     seeing these tokens again.  */
> -  size_t text_length;
>  } cp_token_hunk;
>
>  typedef struct cp_token_hunk *cp_token_hunk_ptr;
> @@ -153,10 +142,6 @@ typedef struct GTY(()) pth_image
>       logic in pth_file_change.  */
>    unsigned hunk_start_ix;
>
> -  /* Offset into the libcpp character buffer where the next token
> -     hunk will be read from.  */
> -  cpp_offset hunk_text_offset;
> -
>    /* libcpp buffer associated with IMAGE->FNAME.  */
>    cpp_buffer * GTY((skip)) buffer;
>  } pth_image;
> Index: libcpp/include/cpplib.h
> ===================================================================
> --- libcpp/include/cpplib.h	(revision 169332)
> +++ libcpp/include/cpplib.h	(working copy)
> @@ -919,8 +919,6 @@ extern const char *cpp_type2name (enum c
>     string literal.  Handles all relevant diagnostics.  */
>  extern cppchar_t cpp_parse_escape (cpp_reader *, const unsigned char ** pstr,
>  				   const unsigned char *limit, int wide);
> -extern lexer_state *cpp_reset_lexer_state (cpp_reader *);
> -extern void cpp_restore_lexer_state (cpp_reader *, lexer_state *);
>
>  /* Structure used to hold a comment block at a given location in the
>     source code.  */
> Index: libcpp/lex.c
> ===================================================================
> --- libcpp/lex.c	(revision 169332)
> +++ libcpp/lex.c	(working copy)
> @@ -2841,27 +2841,3 @@ cpp_token_val_index (cpp_token *tok)
>        return CPP_TOKEN_FLD_NONE;
>      }
>  }
> -
> -/* Reset the lexer state in PFILE and return its previous setting.  */
> -
> -lexer_state *
> -cpp_reset_lexer_state (cpp_reader *pfile)
> -{
> -  lexer_state *s;
> -
> -  s = (lexer_state *) xmalloc (sizeof (lexer_state));
> -  memcpy (s, &pfile->state, sizeof (pfile->state));
> -  memset (&pfile->state, 0, sizeof (pfile->state));
> -
> -  return s;
> -}
> -
> -
> -/* Restore the lexer state in PFILE to S.  */
> -
> -void
> -cpp_restore_lexer_state (cpp_reader *pfile, lexer_state *s)
> -{
> -  memcpy (&pfile->state, s, sizeof (pfile->state));
> -  free (s);
> -}

Sign in to reply to this message.

Powered by Google App Engine
RSS Feeds Recent Issues | This issue
This is Rietveld f62528b