  1. Mar 19, 2010
  2. Mar 13, 2010
    • Merge branch 'push-sideband' into stable-0.7 · 23bd331c
      Shawn Pearce authored
      * push-sideband:
        Reuse the line buffer between strings in PacketLineIn
        http.server: Use TemporaryBuffer and compress some responses
        Reduce multi-level buffered streams in transport code
        Fix smart HTTP client buffer alignment
        Use "ERR message" for early ReceivePack problems
        Catch and report "ERR message" during remote advertisements
        Wait for EOF on stderr before finishing SSH channel
        Capture non-progress side band #2 messages and put in result
        ReceivePack: Enable side-band-64k capability for status reports
        Use more restrictive patterns for sideband progress scraping
        Prefix remote progress tasks with "remote: "
        Decode side-band channel number as unsigned integer
        Refactor SideBandInputStream construction
        Refactor SideBandOutputStream to be buffered
      
      Change-Id: Ic9689e64e8c87971f2fd402cb619082309d5587f
    • Reuse the line buffer between strings in PacketLineIn · 89cdc3b7
      Shawn Pearce authored
      
      When reading pkt-lines off an InputStream we are quite likely to
      consume a whole group of fairly short lines in rapid succession, such
      as in the have exchange that occurs in the fetch-pack/upload-pack
      protocol.  Rather than allocating a throwaway buffer for each
      line's raw byte sequence, reuse a buffer that is equal to the small
      side-band packet size, which is 1000 bytes.  Text based pkt-lines
      are required to be less than this size because many widely deployed
      versions of C Git use a statically allocated array of this length.
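
The buffer reuse described above can be sketched roughly as follows. This is an illustration only: `PktLineReaderSketch` and its members are hypothetical names, not JGit's actual `PacketLineIn`, and the fallback allocation for oversized lines is an assumption of the sketch.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical reader for illustration; it is not JGit's PacketLineIn.
class PktLineReaderSketch {
    // Small side-band packet size named in the commit message.
    private static final int LINE_BUFFER_SIZE = 1000;

    private final InputStream in;
    private final byte[] lenBuf = new byte[4];                 // 4 hex digits of length
    private final byte[] lineBuf = new byte[LINE_BUFFER_SIZE]; // reused for every line

    PktLineReaderSketch(InputStream in) {
        this.in = in;
    }

    /** Returns the next line without its trailing LF, or null for a flush packet ("0000"). */
    String readString() throws IOException {
        readFully(lenBuf, 4);
        int len = Integer.parseInt(new String(lenBuf, StandardCharsets.US_ASCII), 16);
        if (len == 0)
            return null;
        len -= 4; // the length header counts its own 4 bytes
        if (len < 0)
            throw new IOException("invalid pkt-line length");
        // Assumption of this sketch: allocate a one-off array only for oversized lines.
        byte[] raw = len <= lineBuf.length ? lineBuf : new byte[len];
        readFully(raw, len);
        if (len > 0 && raw[len - 1] == '\n')
            len--;
        return new String(raw, 0, len, StandardCharsets.UTF_8);
    }

    private void readFully(byte[] dst, int n) throws IOException {
        int off = 0;
        while (off < n) {
            int r = in.read(dst, off, n - off);
            if (r < 0)
                throw new EOFException("short read in pkt-line");
            off += r;
        }
    }
}
```

Each call decodes its string out of the same `lineBuf`, so a run of short "have" lines allocates only the resulting `String` objects.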
      
      Change-Id: Ia5c8e95b85020f7f80b6d269dda5059b092d274d
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
    • http.server: Use TemporaryBuffer and compress some responses · c0f09389
      Shawn Pearce authored
      
      The HTTP server side code now uses the same approach that the smart
      HTTP client code uses when preparing a request body.  The payload
      is streamed into a TemporaryBuffer of limited size.  If the entire
      data fits, it is compressed with gzip if the user agent supports that,
      and a Content-Length header is used to transmit the fixed length
      body to the peer.  If however the data overflows the limited memory
      segment, it is streamed uncompressed to the peer.
      
      One might initially think that larger contents which overflow
      the buffer should also be compressed, rather than sent raw, since
      they were deemed "large".  But usually these larger contents are
      actually a pack file which has been already heavily compressed by
      Git specific routines.  Trying to deflate that with gzip is probably
      going to take up more space, not less, so the compression overhead
      isn't worthwhile.
      
      This buffer and compress optimization helps repositories with a
      large number of references, as their text based advertisements
      compress well. For example jgit's own native repository currently
      requires 32,628 bytes for its full advertisement of 489 references.
      Most repositories have fewer references, and thus could compress
      their entire response in one buffer.
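
A minimal sketch of the buffer-then-compress decision described in this message, under stated assumptions: `BufferedResponseSketch`, its fields and the `limit` parameter are hypothetical, and the payload is passed as a complete array for brevity, whereas the real code streams into a TemporaryBuffer.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper for illustration; it does not use JGit's TemporaryBuffer API.
class BufferedResponseSketch {
    final byte[] body;              // bytes that would go to the peer
    final boolean gzipped;          // true when "Content-Encoding: gzip" would apply
    final boolean hasContentLength; // true when the body length is known up front

    private BufferedResponseSketch(byte[] body, boolean gzipped, boolean hasContentLength) {
        this.body = body;
        this.gzipped = gzipped;
        this.hasContentLength = hasContentLength;
    }

    static BufferedResponseSketch prepare(byte[] payload, int limit, boolean acceptsGzip) {
        if (payload.length > limit) {
            // Overflow: most likely pack data, so it is sent raw and unbuffered.
            return new BufferedResponseSketch(payload, false, false);
        }
        if (acceptsGzip) {
            // Fits in the buffer and the user agent accepts gzip.
            return new BufferedResponseSketch(compress(payload), true, true);
        }
        return new BufferedResponseSketch(payload, false, true);
    }

    private static byte[] compress(byte[] raw) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream z = new GZIPOutputStream(buf)) {
            z.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }
}
```

A text advertisement that fits under the limit comes back gzipped with a known length, while anything larger is passed through untouched.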
      
      Change-Id: I790609c9f763339e0a1db9172aa570e29af96f42
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
    • Reduce multi-level buffered streams in transport code · 2156aa89
      Shawn Pearce authored
      
      Some transports actually provide stream buffering on their own,
      without needing to be wrapped up inside of a BufferedInputStream in
      order to smooth out system calls to read or write.  A great example
      of this is the JSch SSH client, or the Apache MINA SSHD server.
      Both use custom buffering to packetize the streams into the encrypted
      SSH channel, and wrapping them up inside of a BufferedInputStream
      or BufferedOutputStream is relatively pointless.
      
      Our SideBandOutputStream implementation also provides some fairly
      large buffering, equal to one complete side-band packet on the main
      data channel.  Wrapping that inside of a BufferedOutputStream just to
      smooth out small writes from PackWriter causes extra data copies, and
      provides no advantage.  We can save some memory and some CPU cycles
      by letting PackWriter dump directly into the SideBandOutputStream's
      internal buffer array.
      
      Instead we push the buffering streams down to be as close to the
      network socket (or operating system pipe) as possible.  This allows
      us to smooth out the smaller reads/writes from pkt-line messages
      during advertisement and negotiation, but avoid copying altogether
      when the stream switches to larger writes over a side band channel.
      
      Change-Id: I2f6f16caee64783c77d3dd1b2a41b3cc0c64c159
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
    • Fix smart HTTP client buffer alignment · 882d03f7
      Shawn Pearce authored
      
      This proved to be a pretty difficult bug to find.  If we read exactly
      the number of response bytes from the UnionInputStream and didn't
      try to read beyond that length, the last connection's InputStream is
      still inside of the UnionInputStream, and UnionInputStream.isEmpty()
      returns false.  But there is no data present, so the next read
      request to our UnionInputStream returns EOF at a point where the
      HTTP client code should have started a new request in order to get
      more data.
      
      Instead of wrapping the UnionInputStream, push a dummy stream onto
      the end of it which when invoked always starts the next request and
      then returns EOF.  The UnionInputStream will automatically pop that
      dummy stream out, and then read the next request's stream.
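
The dummy-stream trick can be sketched as below. Both classes are hypothetical stand-ins written for this illustration (`ChainedInputSketch` is not JGit's UnionInputStream), and the "requests" are simulated with in-memory strings.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

// Hypothetical union of streams: reads the head stream and pops it at EOF.
class ChainedInputSketch extends InputStream {
    private final Deque<InputStream> queue = new ArrayDeque<>();

    void add(InputStream in) {
        queue.addLast(in);
    }

    @Override
    public int read() throws IOException {
        while (!queue.isEmpty()) {
            int b = queue.peekFirst().read();
            if (b >= 0)
                return b;
            queue.removeFirst().close(); // exhausted: pop it and try the next one
        }
        return -1;
    }
}

// Simulated multi-request client: a response body is queued only when the
// trailing dummy stream is reached.
class LazyRequestsSketch {
    static ChainedInputSketch open(List<String> responses) {
        ChainedInputSketch union = new ChainedInputSketch();
        union.add(trigger(union, responses.iterator()));
        return union;
    }

    private static InputStream trigger(ChainedInputSketch union, Iterator<String> pending) {
        return new InputStream() {
            @Override
            public int read() {
                if (pending.hasNext()) {
                    // "Start the next request": queue its body, then a fresh dummy behind it.
                    byte[] body = pending.next().getBytes(StandardCharsets.UTF_8);
                    union.add(new ByteArrayInputStream(body));
                    union.add(trigger(union, pending));
                }
                return -1; // always EOF, so the union pops this dummy and moves on
            }
        };
    }
}
```

Because the dummy sits behind the current response, reading exactly the available bytes can never leave the union looking non-empty with nothing to deliver: the next read reaches the dummy, which starts the following request.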
      
      This way we never get into the state where we don't think we need
      to run another request in order to satisfy the current read request,
      but we really do.
      
      The bug was hidden for so long because BasePackConnection.init()
      was always wrapping the InputStream into a BufferedInputStream
      with an 8 KiB buffer.  This made the odds of us reading from the
      UnionInputStream the exact number of available bytes quite low, as
      the BufferedInputStream would always try to read a full buffer size.
      
      Change-Id: I02b5ec3ef6853688687d91de000a5fbe2354915d
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
    • Use "ERR message" for early ReceivePack problems · d8c3e98d
      Shawn Pearce authored
      
      If the application wants to, it can use sendError(String) to send one
      or more error messages to clients before the advertisements are sent.
      These will cause a C Git client to break out of the advertisement
      parsing loop, display "remote error: message\n", and terminate.
      
      Servers can optionally use this to send a detailed error to a client
      explaining why it cannot use the ReceivePack service on a repository.
      Over smart HTTP these errors are sent in a 200 OK response, and
      are in the payload, allowing the Git client to give the end-user
      the custom message rather than the generic error "403 Forbidden".
      
      Change-Id: I03f4345183765d21002118617174c77f71427b5a
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
    • Catch and report "ERR message" during remote advertisements · 1f4a30b8
      Shawn Pearce authored
      
      GitHub broke the native git protocol a while ago by interjecting an
      "ERR message" line into the upload-pack or receive-pack advertisement
      list.  This didn't match the expected pattern, so it caused existing
      C Git clients to abort with a protocol exception.
      
      These days, C Git clients actually look for this message and abort
      with a more graceful notice to the end-user.  JGit should do the
      same, including setting up a custom exception type that makes it
      easier for higher-level UIs to identify a message from the remote
      site and present it to the user.
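
On the client side, the check could look roughly like this. The exception and parser names are hypothetical, chosen for this sketch and not necessarily the types JGit introduced; the "remote error: " prefix is borrowed from the C Git output mentioned elsewhere in this log.

```java
import java.io.IOException;

// Hypothetical exception type that lets a UI recognize a message sent by the remote.
class RemoteMessageExceptionSketch extends IOException {
    RemoteMessageExceptionSketch(String remoteMessage) {
        super("remote error: " + remoteMessage);
    }
}

class AdvertisementParserSketch {
    /** Returns the line unchanged unless the remote sent "ERR message" instead of a ref. */
    static String checkLine(String line) throws RemoteMessageExceptionSketch {
        if (line.startsWith("ERR "))
            throw new RemoteMessageExceptionSketch(line.substring(4));
        return line;
    }
}
```

A caller that catches this dedicated type can show the remote's text to the user instead of reporting a generic protocol failure.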
      
      Change-Id: I51ab62a382cfaf1082210e8bfaa69506fd0d9786
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
    • Wait for EOF on stderr before finishing SSH channel · 243b0d64
      Shawn Pearce authored
      
      JSch will allow us to close the connection and then just drop
      any late messages coming over the stderr stream for the command.
      This makes it easy to lose final output on a command, like from
      Gerrit Code Review's post receive hook.
      
      Instead spawn a background thread to copy data from JSch's pipe
      into our own buffer, and wait for that thread to receive EOF on the
      pipe before we declare the connection closed. This way we don't
      have a race condition between the stderr data arriving and JSch
      just tearing down the channel.
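
The copy-thread idea, sketched without any JSch types: `StderrDrainSketch` is a hypothetical name, and a plain `InputStream` stands in for the pipe that JSch provides.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: drains a stderr-like stream on a background thread.
class StderrDrainSketch {
    private final ByteArrayOutputStream captured = new ByteArrayOutputStream();
    private final Thread copier;

    StderrDrainSketch(InputStream stderr) {
        copier = new Thread(() -> {
            byte[] buf = new byte[512];
            try {
                int n;
                while ((n = stderr.read(buf)) >= 0)
                    captured.write(buf, 0, n);
            } catch (IOException e) {
                // Assumption of this sketch: a broken pipe is treated like EOF.
            }
        }, "stderr-copier");
        copier.setDaemon(true);
        copier.start();
    }

    /** Blocks until the copier has seen EOF, then returns everything it captured. */
    String awaitEof() throws InterruptedException {
        copier.join();
        return new String(captured.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

The caller would invoke `awaitEof()` before declaring the connection closed, so late hook output is collected instead of being dropped.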
      
      Change-Id: Ica1ba40ed2b4b6efb7d5e4ea240efc0a56fb71f6
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
    • Capture non-progress side band #2 messages and put in result · 673b3984
      Shawn Pearce authored
      
      Any messages received on side band #2 that aren't scraped as a
      progress message into our ProgressMonitor are now forwarded to a
      buffer which is later included into the OperationResult object.
      Application callers can use this buffer to present the additional
      messages from the remote peer after the push or fetch operation
      has concluded.
      
      The smart push connections using the native send-pack/receive-pack
      protocol now request side-band-64k capability if it is available
      and forward any messages received through that channel onto this
      message buffer.  This makes hook messages available over smart HTTP,
      or even over SSH.
      
      The SSH transport was modified to redirect the remote command's
      stderr stream into the message buffer, interleaved with any data
      received over side band #2.  Due to buffering between these two
      different channels in the SSH channel mux itself the order of any
      writes between the two cannot be ensured, but it tries to stay close.
      
      The local fork transport was also modified to redirect the local
      receive-pack's stderr into the message buffer, rather than going to
      the invoking JVM's System.err.  This gives applications a chance
      to log the local error messages, rather than needing to redirect
      their JVM's stderr before startup.
      
      To keep things simple, the application has to wait for the entire
      operation to complete before it can see the messages.  This may
      be a downside if the user is trying to debug a remote hook that is
      blocking indefinitely, because the user would need to abort the connection
      before they can inspect the message buffer in any sort of UI built
      on top of JGit.
      
      Change-Id: Ibc215f4569e63071da5b7e5c6674ce924ae39e11
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
    • ReceivePack: Enable side-band-64k capability for status reports · d33f939e
      Shawn Pearce authored
      
      We now advertise the side-band-64k capability inside of ReceivePack,
      allowing hooks to echo status messages down the side band channel
      instead of over the optional stderr stream.
      
      This change permits hooks running inside of an http:// based push
      invocation to still message the end-user with more detailed errors
      than the small per-command string in the status report.
      
      Change-Id: I64f251ef2d13ab3fd0e1a319a4683725455e5244
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
    • Use more restrictive patterns for sideband progress scraping · 4c44810d
      Shawn Pearce authored
      
      To avoid scraping a non-progress message as though it were a progress
      item for the progress monitor, use a more restrictive pattern to
      watch the remote side's messages.  These two regexps should match
      any message produced by C Git since 42e18fbf5f94 ("more compact
      progress display", Oct 2007), and which first appeared in Git 1.5.4.
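
For illustration, two patterns in this spirit might look like the ones below. They are written for this sketch and are not claimed to be the exact expressions in the commit; the class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical scraper; the patterns are illustrative, not copied from JGit.
class ProgressScrapeSketch {
    // e.g. "Counting objects: 42" or "Counting objects: 42, done."
    static final Pattern UNBOUNDED = Pattern.compile(
            "^([\\w ]+): +(\\d+)(?:, done\\.)?\\s*$");

    // e.g. "Compressing objects:  50% (5/10)" or "...: 100% (10/10), done."
    static final Pattern BOUNDED = Pattern.compile(
            "^([\\w ]+): +\\d+% +\\( *(\\d+)/ *(\\d+)\\)(?:, done\\.)?\\s*$");

    /** Returns a "remote: " prefixed task summary, or null if msg is not a progress line. */
    static String describe(String msg) {
        Matcher m = BOUNDED.matcher(msg);
        if (m.matches())
            return "remote: " + m.group(1) + " " + m.group(2) + "/" + m.group(3);
        m = UNBOUNDED.matcher(msg);
        if (m.matches())
            return "remote: " + m.group(1) + " " + m.group(2);
        return null; // not progress: the caller would keep it as a plain message
    }
}
```

Requiring the count, and for bounded tasks the percentage and `(done/total)` pair, keeps an ordinary message such as a hook error from being mistaken for progress.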
      
      Change-Id: I57e34cf59d42c1dbcbd1a83dd6f499ce5e39d15d
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
    • Prefix remote progress tasks with "remote: " · 3a9295b8
      Shawn Pearce authored
      
      When we pull task messages off the remote peer via sideband #2,
      prefix them with the string "remote: " to make it clear to the
      user these are coming from the other system, and not from their
      local client.
      
      Change-Id: I02c5e67c6be67e30e40d3bc4be314d6640feb519
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
    • Decode side-band channel number as unsigned integer · b7e8cefc
      Shawn Pearce authored
      
      This field is unsigned in the protocol, so treat it
      as such when we report the channel number in errors.
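
In Java a `byte` is signed, so the usual way to read such a field is to mask it. A minimal sketch, with a hypothetical helper name:

```java
// Hypothetical helper showing unsigned decoding of a one-byte channel field.
class ChannelDecodeSketch {
    static int channelOf(byte raw) {
        return raw & 0xff; // 0..255 instead of -128..127
    }
}
```

Without the mask, a channel byte of 0xC8 would be reported as -56 rather than 200.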
      
      Change-Id: I20a52809c7a756e9f66b3557a4300ae1e11f6d25
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
    • Refactor SideBandInputStream construction · f2dc9f0b
      Shawn Pearce authored
      
      Typically we refer to the raw InputStream (the stream without the
      pkt-line headers on it) as rawIn, and the pkt-line header variant
      as pckIn.  Refactor our fields to reflect that.  To ensure these
      are actually the same underlying InputStream, we now create our own
      PacketLineIn wrapper around the supplied raw InputStream.  It's a
      very low-cost object since it has only the 4 byte length buffer.
      
      Instead of hardcoding the header length as 5, use the constant from
      SideBandOutputStream.  This makes it a bit clearer exactly what we
      are consuming here.
      
      Change-Id: Iebd05538042913536b88c3ddc3adc3a86a841cc5
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
    • Refactor SideBandOutputStream to be buffered · 0af5944c
      Shawn Pearce authored
      
      Instead of relying on our callers to wrap us up inside of a
      BufferedOutputStream and using the proper block sizing, do the
      buffering directly inside of SideBandOutputStream.  This ensures
      we don't get large write-throughs from BufferedOutputStream that
      might overflow the configured packet size.
      
      The constructor of SideBandOutputStream is also beefed up to check
      its arguments and ensure they are within acceptable ranges for the
      current side-band protocol.
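
A rough sketch of a self-buffering side-band writer follows. `SideBandWriterSketch` is a hypothetical class, and the exact argument ranges it checks are assumptions of this sketch. The framing itself (4 hex length digits, then 1 channel byte, with 65520 as the largest side-band-64k packet) follows the Git wire protocol.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; it is not JGit's SideBandOutputStream.
class SideBandWriterSketch extends OutputStream {
    static final int HDR_SIZE = 5;    // 4 hex length digits + 1 channel byte
    static final int MAX_BUF = 65520; // largest side-band-64k packet

    private final OutputStream out;
    private final byte[] buffer;  // one whole packet: header followed by payload
    private int cnt = HDR_SIZE;   // next free index; header space is reserved up front

    SideBandWriterSketch(int channel, int packetSize, OutputStream out) {
        // Assumed ranges for this sketch.
        if (channel <= 0 || channel > 255)
            throw new IllegalArgumentException("channel " + channel + " must be in 1..255");
        if (packetSize <= HDR_SIZE || packetSize > MAX_BUF)
            throw new IllegalArgumentException("packet size " + packetSize + " out of range");
        this.out = out;
        this.buffer = new byte[packetSize];
        this.buffer[4] = (byte) channel;
    }

    @Override
    public void write(int b) throws IOException {
        if (cnt == buffer.length)
            writeBuffer(); // packet is full: emit it before accepting more
        buffer[cnt++] = (byte) b;
    }

    @Override
    public void flush() throws IOException {
        if (cnt > HDR_SIZE)
            writeBuffer();
        out.flush();
    }

    private void writeBuffer() throws IOException {
        // The length field covers the header and the payload.
        byte[] len = String.format("%04x", cnt).getBytes(StandardCharsets.US_ASCII);
        System.arraycopy(len, 0, buffer, 0, 4);
        out.write(buffer, 0, cnt);
        cnt = HDR_SIZE;
    }
}
```

Since the stream owns the packet buffer, no write can pass through in a chunk larger than the configured packet size, which is the failure mode the message describes for an external BufferedOutputStream.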
      
      Change-Id: Ic14567327d03c9e972f9734b8228178bc448867d
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
  3. Mar 12, 2010
  4. Mar 11, 2010
  5. Mar 10, 2010
  6. Mar 08, 2010
    • Script to fix license headers and copyrights in Java sources · 57822587
      Matthias Sohn authored
      
      The script merges explicit copyright statements in all Java
      sources with author information from git history, updates the
      copyright headers accordingly, and updates the license headers
      to EDL.  For recognized copyright formats see the test data in
      tools/fix-headers.tst.
      
      To fix headers only in the current working directory:
      
        ./tools/fix-headers.pl
      
      To fix the headers for all revisions (don't do this if you don't
      understand the implications of rewriting history) run:
      
        ./tools/rewrite-history.sh
      
      Authors are mapped to employer copyright statements through a
      hardcoded table in the top of the script.  This is a crude but
      simple way to list date ranges under which certain changes need
      to be attributed to copyright holders other than the author.
      
      Change-Id: I654d758658cded02d91324c385f336bcc57fd85f
      Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
  7. Feb 17, 2010
  8. Feb 12, 2010
  9. Feb 11, 2010
    • Allow users of ReceivePack access to the objects being sent · 19126f70
      Nico Sallembien authored
      When implementing branch read access, we need to prove that the
      newly created reference(s) point to objects that the user can see.
      
      There are two ways that an object is reachable:
      1)  It's reachable from a branch or change the user can see
      2)  It was uploaded as part of the pack file the user sent us
      
      This change adds additional methods in ReceivePack that will allow a
      server to check the above conditions, in order to ensure that a user
      is not trying to create a reference that they cannot see, or that a
      malicious user isn't attempting to forge the SHA-1 of an object that
      they cannot see in order to base a change off of it.
      
      Change-Id: Ieba75b4f0331e06a03417c37f4ae1ebca4fbee5a
    • Don't doubly wrap TransportException in smart HTTP client · dd931bd9
      Shawn Pearce authored
      
      If the readAdvertisedRefs() method throws an exception, it has already
      closed the connection and wrapped the underlying cause inside of a
      suitable TransportException object that it is throwing.  We shouldn't
      catch IOException and rethrow a wrapped copy here, because we'll double
      wrap the exception thrown by readAdvertisedRefs.  This may obscure the
      root cause of the connection failure from the end-user.
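
The shape of the fix can be sketched as follows. Every name here is a hypothetical stand-in (`TransportExceptionSketch` is not JGit's TransportException, and `connect()` is invented for the example); the point is only that the call which already wraps its failure sits outside the wrapping `try`.

```java
import java.io.IOException;

// Hypothetical stand-in for a transport-level exception type.
class TransportExceptionSketch extends IOException {
    TransportExceptionSketch(String msg, Throwable cause) {
        super(msg, cause);
    }
}

class SmartHttpOpenSketch {
    // Stand-in that fails and, as in the message above, wraps the cause itself.
    static void readAdvertisedRefs() throws TransportExceptionSketch {
        throw new TransportExceptionSketch("cannot read advertisement",
                new IOException("connection reset"));
    }

    // Invented placeholder for opening the HTTP connection; it succeeds here.
    static void connect() throws IOException {
    }

    static void open() throws TransportExceptionSketch {
        try {
            connect();
        } catch (IOException err) {
            // Raw I/O failures still get wrapped exactly once.
            throw new TransportExceptionSketch("cannot open connection", err);
        }
        // Outside the try block: its exception is already wrapped and passes through.
        readAdvertisedRefs();
    }
}
```

With this layout the exception that reaches the caller has the original `IOException` as its direct cause, not another wrapper.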
      
      Change-Id: I0ca61560f9888c666323dac8a5582aab25e897ff
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
  10. Feb 10, 2010
  11. Feb 08, 2010