Ticket #1731 (assigned bug)

Opened 4 years ago

Last modified 4 years ago

Assumption made about buffer header alignment

Reported by: Paul C. Anagnostopoulos Owned by: Paul C. Anagnostopoulos
Priority: normal Milestone:
Component: core Version: 2.6.0
Severity: low Keywords:
Cc: Language:
Patch status: Platform:

Description

Various memory management routines (e.g., gc_ms_allocate_buffer_storage) assume that the size of a buffer header is equal to the size of a pointer. This is probably true throughout the system as it stands, but those same routines take pains not to make that assumption in other places.

Here is a line from the above routine:

    Buffer_buflen(buffer)   = new_size - sizeof (void *);

If the size of a pointer is less than the buffer header size, the value stored in buflen will be too big. new_size includes the entire size of the buffer header, which may include alignment padding in addition to the pointer.

Change History

  Changed 4 years ago by Paul C. Anagnostopoulos

  • owner set to Paul C. Anagnostopoulos
  • status changed from new to assigned

  Changed 4 years ago by doughera

On Thu, 5 Aug 2010, Parrot wrote:

>  Various memory management routines (e.g., `gc_ms_allocate_buffer_storage`)
>  assume that the size of a buffer header is equal to the size of a pointer.
>  This is probably true throughout the system as it stands, but those same
>  routines take pains not to make that assumption in other places.
> 
>  Here is a line from the above routine:
>  {{{
>      Buffer_buflen(buffer)   = new_size - sizeof (void *);
>  }}}
>  If the size of a pointer is less than the buffer header size, the value
>  stored in `buflen` will be too big. `new_size` includes the entire size of
>  the buffer header, which may include alignment padding in addition to the
>  pointer.

I don't understand what you are saying.  A Buffer looks like this
(include/parrot/pobj.h):

typedef struct buffer_t {
    Parrot_UInt flags;
    void *     _bufstart;
    size_t     _buflen;
} Buffer;

Which part, specifically, is the "header"? Or are you referring to 
something else?

-- 
    Andy Dougherty		doughera@lafayette.edu


follow-up: ↓ 4   Changed 4 years ago by Paul C. Anagnostopoulos

Sorry, confusing terminology. The buffer "header" is the information stored in front of the actual buffer memory. The Buffer object is the buffer "descriptor." Here's the new description from pobj.h:

/* A buffer descriptor object points to a buffer in a Memory_Block.
   The buffer includes a header, but _bufstart points to the data
   portion. Here is how it works:

    Buffer descriptor                     buffer
   +-------------------+                 +------------------------+
   |       flags       |                 |  (possible padding)    | }
   +-------------------+                 +---------------------+--+  > header
   |      _bufstart    | ------+         |    *Memory_Block    |fl| }
   +-------------------+       |         +---------------------+--+
   |      _buflen      |       +-------> |    data portion        |
   +-------------------+                 |                        |
                                         ~                        ~
                                         |                        |
                                         +------------------------+

   The buffer header consists of possible padding and a pointer to the
   Memory_Block containing the buffer. There are two flags in the low-order
   bits of the pointer (see string.h). Padding is only required if the
   alignment of the data portion is higher than that of a pointer.
   This was not the case as of 8/2010.
*/

in reply to: ↑ 3   Changed 4 years ago by doughera

Replying to Paul C. Anagnostopoulos:

Sorry, confusing terminology. The buffer "header" is the information stored in front of the actual buffer memory. The Buffer object is the buffer "descriptor."

Ah, I see. Thanks for the new description. I had hoped that with the new memory organization, that hidden flag stuck before _bufstart could have been eliminated. I have never been a fan of that trick, and prefer to tell the compiler explicitly what I'm doing so it automatically gets all the alignment correct.

Does every Buffer have that Memory_Block element? If so, would it make sense to just explicitly list it in the Buffer "descriptor?"

Padding is only required if the alignment of the data portion is higher than that of a pointer. This was not the case as of 8/2010.

It depends on how the memory blocks are obtained and carved up. On 32-bit SPARC, doubles should be aligned at 8-byte boundaries, but pointers are only 4 bytes. I'm not sure how parrot obtains and carves up space for those buffers.

  Changed 4 years ago by Paul C. Anagnostopoulos

Now that I think about my terminology, it runs contrary to the terminology used elsewhere. I'll fix that in the comments.

I believe that every memory buffer has the Memory_Block in its header, although I'm not positive. In order to move it to the Buffer object, we would have to do a careful analysis to see if any functions grab the Memory_Block directly from a memory buffer rather than via its Buffer object.

Buffers are currently aligned using WORD_ALIGN_1, which is based on the size of a pointer. That's why this problem won't arise now. But there is BUFFER_ALIGN_1 lurking around and it would align on 8-byte boundaries if it were used. This is odd, since a comment claims that buffers are aligned for FLOATVALs. Either the comment is wrong or I don't understand what is going on.

  Changed 4 years ago by Paul C. Anagnostopoulos

The Buffer descriptor is indeed called the header, so I'll now refer to the stuff preceding the actual buffer memory as the prolog.

follow-up: ↓ 8   Changed 4 years ago by Paul C. Anagnostopoulos

We have two options for fixing the faulty assumption described in the original ticket.

One approach is to simply declare by fiat that the size of the buffer prolog will always equal the size of a pointer. This means that the buffer data area will always be aligned on the same alignment as a pointer; there will be no possibility of any other alignment.

The second approach is actually to fix the problem. This means that the calculation of the buffer alignment will become more complex, which affects about a dozen functions in three or four files.

Which approach should we take?

in reply to: ↑ 7   Changed 4 years ago by doughera

Replying to Paul C. Anagnostopoulos:

Which approach should we take?

It depends on what buffers are supposed to be able to store. If they are supposed to be able to store arbitrary items (e.g. FLOATVALS, or structures) then they must be aligned properly for those structures, which may require padding. If they are not supposed to be able to store arbitrary items, then less strict alingment may be appropriate.

I was not able to find any documentation for what they are supposed to be able to store. Many of the comments in the source about alignment date back to times when there was a "cache" element in most of the structures that could hold a FLOATVAL, and may not apply anymore.

Note: See TracTickets for help on using tickets.