  1. Apr 17, 2010
    • ReceivePack: Clarify the check reachable option · 585dcb7a
      Shawn Pearce authored
      
      This option was mis-named from day 1.  It's not checking that the
      objects provided by the client are reachable, it's actually doing
      a scan to prove that objects referenced by the client are already
      reachable through another reference on the server, or were sent
      as part of the pack from the client.
      
      Rename it checkReferencedObjectsAreReachable, since we really are
      trying to validate that objects referenced by the client's actions
      are reachable to the client.
      
      We also need to ensure we run checkConnectivity() anytime this is
      enabled, even if the caller didn't turn on fsck for object formats.
      Otherwise the check would be completely bypassed.
      
      Change-Id: Ic352ddb0ca8464d407c6da5c83573093e018af19
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
      585dcb7a
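As a rough illustration of the check described in this message, the sketch below flags any object referenced by the client that is neither part of the received pack nor already reachable from a reference on the server. The class, method and parameter names are hypothetical, and object ids are plain strings here. It is a simplified model over precomputed sets, not the checkConnectivity() implementation.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Set;

/** Hypothetical model of the "referenced objects are reachable" rule. */
final class ReferencedObjectsSketch {
    /**
     * Returns the referenced ids that were neither sent in the pack nor
     * already reachable through a server-side reference.
     */
    static List<String> findUnreachable(Collection<String> referenced,
            Set<String> providedInPack, Set<String> reachableFromRefs) {
        List<String> missing = new ArrayList<>();
        for (String id : referenced) {
            // An object is acceptable if the client sent it, or if the
            // server could already reach it from an existing reference.
            if (!providedInPack.contains(id) && !reachableFromRefs.contains(id))
                missing.add(id);
        }
        return missing;
    }
}
```

An empty result means every referenced object is accounted for. Any id in the result would be grounds to reject the push.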
    • ReceivePack: Micro-optimize object lookup when checking connectivity · a7702050
      Shawn Pearce authored
      
      If we are checking the visibility of everything referenced in the
      pack that isn't already reachable by a reference, it needs to be
      in the provided set.  Since the provided set lists everything that
      is in this pack, we can avoid checking to see if the blob exists
      on disk, because we know it should be there: it was found in the
      pack we just consumed.
      
      Change-Id: Ie3c7746f734d13077242100a68e048f1ac18c34a
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
      a7702050
    • ReceivePack: Correct type of not provided object · 6029bb24
      Shawn Pearce authored
      
      If a tree was referenced but not provided in the pack, report it
      as a missing tree and not as a missing blob.
      
      Change-Id: Iab05705349cdf0d30cc3f8afc6698a8d2a941343
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
      6029bb24
    • IndexPack: Tighten up new and base object bookkeeping · 2bb8defa
      Shawn Pearce authored
      
      The only current consumer of these collections is ReceivePack,
      where it needs to test ObjectId equality between a RevObject and an
      ObjectId.  There we were copying from a traditional HashSet<ObjectId>
      into an ObjectIdSubclassMap<ObjectId>, as the latter can perform
      hashing using ObjectId's native value support, bypassing RevObject's
      override on hashCode() and equals().  Instead of doing that copy,
      directly create ObjectIdSubclassMap instances inside of ReceivePack.
      
      We also only need to record the objects that do not appear in the
      incoming pack, and were therefore copied from the local repository
      in order to complete delta resolution.  Instead of listing everything
      that used an OBJ_REF_DELTA format, list only the objects that we
      pulled from the destination repository via a normal ObjectLoader.
      
      ReceivePack can now discard the IndexPack object, and all of its
      other data, as soon as these collections are held by the check
      connectivity method.  This frees up memory for the ObjectWalk's
      own RevObject pool.
      
      Change-Id: I22ef71b45c2045a0202e7fd550a770ee1f6f38a6
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
      2bb8defa
  2. Apr 16, 2010
    • ReceivePack: Remove need new,base object id properties · 329a0e16
      Shawn Pearce authored
      
      These are more like internal implementation details of how IndexPack
      works with ReceivePack to validate the incoming object stream.
      Callers who are embedding the ReceivePack logic in their own
      application don't really need to know the details of which objects
      were used for delta bases in the incoming thin pack, or exactly
      which objects were newly transmitted.
      
      Hide these from the API, as exposing them through ReceivePack was
      an early mistake.
      
      Change-Id: I7ee44a314fa19e6a8520472ce05de92c324ad43e
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
      329a0e16
    • ReceivePack: Discard IndexPack as soon as possible · 8279361d
      Shawn Pearce authored
      
      The IndexPack object carries a good bit of state within itself about
      the objects received over the wire.  The earlier we can discard it,
      the sooner the GC is able to reclaim this chunk of memory for other
      uses.  So drop it as soon as we are certain the pack is valid and we
      have no connectivity concerns.
      
      Change-Id: I1e8bc87c2e9183733043622237a064e55957891f
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
      8279361d
    • ReceivePack: fix ensureProvidedObjectsVisible on thin packs · 7a91b180
      Shawn Pearce authored
      
      If ensureProvidedObjectsVisible is enabled we expected any trees or
      blobs directly reachable from an advertised reference to be marked
      with UNINTERESTING.  Unfortunately ObjectWalk doesn't bother setting
      this until the traversal is complete.  Even then it won't necessarily
      set it on every tree if the corresponding commit wasn't popped.
      
      When we are going to check the base objects for the received pack,
      ensure the UNINTERESTING flag gets carried into every immediately
      reachable tree or blob, because these are the ones that the client
      might try to use as delta bases in a thin pack.
      
      Change-Id: I5d5fdcf07e25ac9fc360e79a25dff491925e4101
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
      7a91b180
    • ObjectIdSubclassMap: Correct Iterator to throw NoSuchElementException · 466bec3c
      Shawn Pearce authored
      
      The Iterator contract says next() shall throw NoSuchElementException
      if there are no more items remaining in the iteration.  We got this
      wrong when I originally wrote the implementation, so fix it.
      
      Change-Id: Iea25e6569ead5c8b3128b8a368c5b2caebec7ecc
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
      466bec3c
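The Iterator contract mentioned in this message can be shown with a minimal sketch. `ArrayIterator` is a hypothetical array-backed class written for illustration. It is not the ObjectIdSubclassMap iterator itself.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Hypothetical array-backed iterator; not the JGit implementation. */
final class ArrayIterator<T> implements Iterator<T> {
    private final T[] items;
    private int next;

    ArrayIterator(T[] items) {
        this.items = items;
    }

    @Override
    public boolean hasNext() {
        return next < items.length;
    }

    @Override
    public T next() {
        // The contract requires NoSuchElementException once the iteration
        // is exhausted, rather than null or an index-related exception.
        if (!hasNext())
            throw new NoSuchElementException();
        return items[next++];
    }
}
```

Callers that loop on hasNext() never see the exception. It only matters for code that calls next() unconditionally.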
    • ObjectIdSubclassMap: Add isEmpty() method · 4cc7b1c5
      Shawn Pearce authored
      
      This class behaves like a cross between a Set and a Map, so sometimes
      we might expect to use the method isEmpty() to test for size() == 0.
      So implement it, reducing the surprise folks get when they are given
      one of these objects.
      
      Change-Id: I0d68e1243da8e62edf79c6ba4fd925f643e80a88
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
      4cc7b1c5
    • IndexPack: Correct thin pack fix using less than 20 bytes · 06ee913c
      Shawn Pearce authored
      
      If we need to append less than 20 bytes in order to fix a thin pack
      and make it complete, we need to set the length of our file back to
      the actual number of bytes used because the original SHA-1 footer was
      not completely overwritten.  That extra data will confuse the header
      and footer fixup logic when it tries to read to the end of the file.
      
      This isn't a very common case to occur, which is why we've never
      seen it before.  Getting a delta that requires a whole object which
      uses less than 20 bytes in pack representation is really hard.
      Generally a delta generator won't make these, because the delta
      would be bigger than simply deflating the whole object.  I only
      managed to do this with a hand-crafted pack file where a 1 byte
      delta was pointed to a 1 byte whole object.
      
      Normally we try really hard to avoid truncating, because it's
      typically not safe across network filesystems.  But the odds of
      this occurring are very low.  This truncation is done on a file
      we have open for writing, will append more content onto, and is
      a temporary file that we won't move into position for others to
      see until we've validated its SHA-1 is sane.  I don't think the
      truncate on NFS issue is something we need to worry about here.
      
      Change-Id: I102b9637dfd048dc833c050890d142f43c1e75ae
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
      06ee913c
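The truncation described in this message can be sketched with `java.io.RandomAccessFile`. The class and method names below are hypothetical and the file layout is reduced to "body followed by a 20-byte footer". It illustrates only why the length must be reset when fewer than 20 bytes are written over the old footer, and is not the IndexPack code.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

/** Hypothetical sketch of overwriting a trailing SHA-1 footer. */
final class ThinPackTruncateSketch {
    static final int FOOTER_LEN = 20; // size of a SHA-1 digest

    /**
     * Writes {@code appended} starting where the old footer begins and
     * returns the resulting file length.
     */
    static long appendOverFooter(File pack, byte[] appended) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(pack, "rw")) {
            long footerStart = raf.length() - FOOTER_LEN;
            raf.seek(footerStart);
            raf.write(appended);
            long end = raf.getFilePointer();
            // With fewer than 20 appended bytes, part of the old footer
            // would still trail the new data, so cut the file back.
            if (end < raf.length())
                raf.setLength(end);
            return raf.length();
        }
    }
}
```

When 20 or more bytes are appended the old footer is fully overwritten and setLength() is never called, which matches the common case the message describes.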
  3. Apr 13, 2010
  4. Apr 12, 2010
  5. Apr 11, 2010
  6. Apr 10, 2010
  7. Apr 05, 2010
    • JGit plugin not compatible with Eclipse 3.4 · fa4c3fe4
      Robin Rosenberg authored
      
      The JSch bundle in Eclipse 3.4 does not export its packages with
      version numbers. Use Require-Bundle on version 0.1.37, which comes
      with Eclipse 3.4.
      
      There is no 0.1.37 in the maven repositories so the pom still refers
      to 0.1.41 so the build can get the compile time dependencies right.
      
      Bug: 308031
      CQ: 3904 jsch Version: 0.1.37 (using Orbit CQ2014)
      
      Change-Id: I12eba86bfbe584560c213882ebba58bf1f9fa0c1
      Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
      fa4c3fe4
  8. Mar 23, 2010
  9. Mar 22, 2010
  10. Mar 21, 2010
    • Fix EGit deadlock listing branches of SSH remote · 0dc93a2f
      Shawn Pearce authored
      
      When listing branches, EGit only reads the advertisement and
      then disconnects.  When it closes down the pack channel the remote
      side is waiting for the client to send our list of commands, or a
      flush-pkt to let it know there is nothing to do.
      
      However if an error thread is open watching the SSH stderr stream,
      we ask for it to finish before we send the flush-pkt.  Unfortunately
      the thread won't terminate until the main output stream closes,
      which is waiting for the flush-pkt.  A classic network deadlock.
      
      If the output stream needs a flush-pkt we send it before we wait
      for the error stream to close.  If the flush-pkt is rejected, we
      close down the output stream early, assuming that the remote side
      is broken and we will get error information soon.
      
      Change-Id: I8d078a339077756220c113f49d206b1bf295d434
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
      0dc93a2f
    • Merge branch 'stable-0.7' · 9285240d
      Shawn Pearce authored
      * stable-0.7:
        Qualify post-0.7.0 builds
        JGit 0.7.0
      
      This is an 'ours' merge to avoid bringing in the 0.7.0 version
      numbers in the manifest and pom files.
      
      Change-Id: Iad6354af57aaa2f233142fbf679489b08c121a71
      9285240d
    • Qualify builds as 0.8.0 · 14e469c4
      Shawn Pearce authored
      
      Since the API is changing relative to 0.7.0, we'll call our next
      release 0.8.1.  But until that gets released, builds from master
      will be 0.8.0.qualifier.
      
      Change-Id: I921e984f51ce498610c09e0db21be72a533fee88
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
      14e469c4
    • Merge branch 'stable-0.7' · 624572b6
      Shawn Pearce authored
      * stable-0.7:
        tools/version.sh: Update OSGi manifest files
        Drop CQ 3448 from IP log
      
      Change-Id: I8d78d27c48c16a70078bf76b255f8ade8e94db2a
      624572b6
    • Qualify post-0.7.0 builds · 7182fbc4
      Shawn Pearce authored
      
      Change-Id: I5afdc624b28fab37b28dd2cc71d334198672eef3
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
      7182fbc4
  11. Mar 19, 2010
  12. Mar 18, 2010
    • Add a paranoid 'must be provided' option to ReceivePack · 0f95d2d0
      Nico Sallembien authored
      
      By default a receive pack assumes that its user will only provide
      references to objects that the user already has access to on their
      local client.  In certain cases, an additional check to verify the
      references point only to reachable objects is necessary.
      
      This additional checking is useful when the code doesn't trust
      the client not to provide a forged SHA-1 reference to an object,
      in an attempt to access parts of the DAG that they weren't allowed
      to see by the configured RefFilter.
      
      Change-Id: I3e4b8505cb2992e3e4be253abb14a1501e47b970
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
      0f95d2d0
  13. Mar 13, 2010
    • Merge branch 'stable-0.7' · 6fabb6d2
      Shawn Pearce authored
      * stable-0.7:
        Reuse the line buffer between strings in PacketLineIn
        http.server: Use TemporaryBuffer and compress some responses
        Reduce multi-level buffered streams in transport code
        Fix smart HTTP client buffer alignment
        Use "ERR message" for early ReceivePack problems
        Catch and report "ERR message" during remote advertisements
        Wait for EOF on stderr before finishing SSH channel
        Capture non-progress side band #2 messages and put in result
        ReceivePack: Enable side-band-64k capability for status reports
        Use more restrictive patterns for sideband progress scraping
        Prefix remote progress tasks with "remote: "
        Decode side-band channel number as unsigned integer
        Refactor SideBandInputStream construction
        Refactor SideBandOutputStream to be buffered
      6fabb6d2
    • Merge branch 'push-sideband' into stable-0.7 · 23bd331c
      Shawn Pearce authored
      * push-sideband:
        Reuse the line buffer between strings in PacketLineIn
        http.server: Use TemporaryBuffer and compress some responses
        Reduce multi-level buffered streams in transport code
        Fix smart HTTP client buffer alignment
        Use "ERR message" for early ReceivePack problems
        Catch and report "ERR message" during remote advertisements
        Wait for EOF on stderr before finishing SSH channel
        Capture non-progress side band #2 messages and put in result
        ReceivePack: Enable side-band-64k capability for status reports
        Use more restrictive patterns for sideband progress scraping
        Prefix remote progress tasks with "remote: "
        Decode side-band channel number as unsigned integer
        Refactor SideBandInputStream construction
        Refactor SideBandOutputStream to be buffered
      
      Change-Id: Ic9689e64e8c87971f2fd402cb619082309d5587f
      23bd331c
    • Reuse the line buffer between strings in PacketLineIn · 89cdc3b7
      Shawn Pearce authored
      
      When reading pkt-lines off an InputStream we are quite likely to
      consume a whole group of fairly short lines in rapid succession, such
      as in the have exchange that occurs in the fetch-pack/upload-pack
      protocol.  Rather than allocating a throwaway buffer for each
      line's raw byte sequence, reuse a buffer that is equal to the small
      side-band packet size, which is 1000 bytes.  Text based pkt-lines
      are required to be less than this size because many widely deployed
      versions of C Git use a statically allocated array of this length.
      
      Change-Id: Ia5c8e95b85020f7f80b6d269dda5059b092d274d
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
      89cdc3b7
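The buffer reuse described in this message might look roughly like the sketch below. `PktLineReaderSketch` is a hypothetical class, not PacketLineIn, and it assumes the standard pkt-line framing: four hex digits giving the total length including the header, with "0000" as the flush-pkt.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical pkt-line reader that reuses one line buffer. */
final class PktLineReaderSketch {
    private static final int MAX_TEXT_LINE = 1000; // small side-band packet size

    private final InputStream in;
    // Allocated once and reused for every line, instead of per line.
    private final byte[] lineBuffer = new byte[MAX_TEXT_LINE];

    PktLineReaderSketch(InputStream in) {
        this.in = in;
    }

    /** Returns the next line's payload, or null for a flush-pkt ("0000"). */
    String readString() throws IOException {
        readFully(4);
        int len = Integer.parseInt(
                new String(lineBuffer, 0, 4, StandardCharsets.US_ASCII), 16);
        if (len == 0)
            return null;
        int payload = len - 4; // the length field counts its own 4 bytes
        if (payload < 0 || payload > lineBuffer.length)
            throw new IOException("Invalid pkt-line length: " + len);
        readFully(payload);
        return new String(lineBuffer, 0, payload, StandardCharsets.UTF_8);
    }

    /** Fills the start of lineBuffer with exactly cnt bytes. */
    private void readFully(int cnt) throws IOException {
        int off = 0;
        while (off < cnt) {
            int n = in.read(lineBuffer, off, cnt - off);
            if (n < 0)
                throw new EOFException("Short read of pkt-line");
            off += n;
        }
    }
}
```

Only the returned String is allocated per line. The raw bytes always land in the same array, which is the saving the message describes for runs of short lines.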
    • http.server: Use TemporaryBuffer and compress some responses · c0f09389
      Shawn Pearce authored
      
      The HTTP server side code now uses the same approach that the smart
      HTTP client code uses when preparing a request body.  The payload
      is streamed into a TemporaryBuffer of limited size.  If the entire
      data fits, it's compressed with gzip if the user agent supports that,
      and a Content-Length header is used to transmit the fixed length
      body to the peer.  If however the data overflows the limited memory
      segment, it's streamed uncompressed to the peer.
      
      One might initially think that larger contents which overflow
      the buffer should also be compressed, rather than sent raw, since
      they were deemed "large".  But usually these larger contents are
      actually a pack file which has been already heavily compressed by
      Git specific routines.  Trying to deflate that with gzip is probably
      going to take up more space, not less, so the compression overhead
      isn't worthwhile.
      
      This buffer and compress optimization helps repositories with a
      large number of references, as their text based advertisements
      compress well. For example jgit's own native repository currently
      requires 32,628 bytes for its full advertisement of 489 references.
      Most repositories have fewer references, and thus could compress
      their entire response in one buffer.
      
      Change-Id: I790609c9f763339e0a1db9172aa570e29af96f42
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
      c0f09389
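The decision this message describes (gzip only when the whole body fits in the buffer and the client accepts it) can be sketched as follows. This is a simplified model that takes the payload as a byte array. The class name, method name and `limit` parameter are hypothetical, and the real code streams into a TemporaryBuffer instead.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPOutputStream;

/** Hypothetical sketch of the "buffer, then maybe compress" choice. */
final class BufferAndCompressSketch {
    /**
     * Returns a gzip-compressed copy when the payload fits within the limit
     * and the client accepts gzip; otherwise returns the payload untouched.
     */
    static byte[] prepareBody(byte[] payload, int limit, boolean clientAcceptsGzip) {
        // Oversized bodies are usually pack data that is already
        // compressed, so they are passed through as-is.
        if (payload.length > limit || !clientAcceptsGzip)
            return payload;
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(payload);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```

A caller would set Content-Encoding: gzip and Content-Length from the returned array only when it differs from the input.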
    • Reduce multi-level buffered streams in transport code · 2156aa89
      Shawn Pearce authored
      
      Some transports actually provide stream buffering on their own,
      without needing to be wrapped up inside of a BufferedInputStream in
      order to smooth out system calls to read or write.  A great example
      of this is the JSch SSH client, or the Apache MINA SSHD server.
      Both use custom buffering to packetize the streams into the encrypted
      SSH channel, and wrapping them up inside of a BufferedInputStream
      or BufferedOutputStream is relatively pointless.
      
      Our SideBandOutputStream implementation also provides some fairly
      large buffering, equal to one complete side-band packet on the main
      data channel.  Wrapping that inside of a BufferedOutputStream just to
      smooth out small writes from PackWriter causes extra data copies, and
      provides no advantage.  We can save some memory and some CPU cycles
      by letting PackWriter dump directly into the SideBandOutputStream's
      internal buffer array.
      
      Instead we push the buffering streams down to be as close to the
      network socket (or operating system pipe) as possible.  This allows
      us to smooth out the smaller reads/writes from pkt-line messages
      during advertisement and negotiation, but avoid copying altogether
      when the stream switches to larger writes over a side band channel.
      
      Change-Id: I2f6f16caee64783c77d3dd1b2a41b3cc0c64c159
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
      2156aa89
    • Fix smart HTTP client buffer alignment · 882d03f7
      Shawn Pearce authored
      
      This proved to be a pretty difficult bug to find.  If we read exactly
      the number of response bytes from the UnionInputStream and didn't
      try to read beyond that length, the last connection's InputStream is
      still inside of the UnionInputStream, and UnionInputStream.isEmpty()
      returns false.  But there is no data present, so the next read
      request to our UnionInputStream returns EOF at a point where the
      HTTP client code should have started a new request in order to get
      more data.
      
      Instead of wrapping the UnionInputStream, push a dummy stream onto
      the end of it which when invoked always starts the next request and
      then returns EOF.  The UnionInputStream will automatically pop that
      dummy stream out, and then read the next request's stream.
      
      This way we never get into the state where we don't think we need
      to run another request in order to satisfy the current read request,
      but we really do.
      
      The bug was hidden for so long because BasePackConnection.init()
      was always wrapping the InputStream into a BufferedInputStream
      with an 8 KiB buffer.  This made the odds of us reading from the
      UnionInputStream the exact number of available bytes quite low, as
      the BufferedInputStream would always try to read a full buffer size.
      
      Change-Id: I02b5ec3ef6853688687d91de000a5fbe2354915d
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
      882d03f7
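The dummy-stream technique from this message can be modeled with a much simplified stand-in for UnionInputStream. Both classes below are hypothetical, and canned byte arrays take the place of real HTTP round trips. The trigger stream "starts the next request" when it is first read, then reports EOF so the union pops it and moves on to the new response.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;

/** Hypothetical, simplified union of streams read one after another. */
final class UnionSketch extends InputStream {
    private final Deque<InputStream> streams = new ArrayDeque<>();

    void add(InputStream in) {
        streams.addLast(in);
    }

    @Override
    public int read() throws IOException {
        while (!streams.isEmpty()) {
            int b = streams.peekFirst().read();
            if (b >= 0)
                return b;
            streams.removeFirst().close(); // exhausted: pop and try the next
        }
        return -1;
    }
}

/** Dummy stream: queues the next response when read, then reports EOF. */
final class RequestTrigger extends InputStream {
    private final UnionSketch union;
    private final Iterator<byte[]> responses; // stands in for HTTP requests

    RequestTrigger(UnionSketch union, Iterator<byte[]> responses) {
        this.union = union;
        this.responses = responses;
    }

    @Override
    public int read() {
        if (responses.hasNext()) {
            union.add(new ByteArrayInputStream(responses.next()));
            union.add(new RequestTrigger(union, responses)); // re-arm
        }
        return -1; // always EOF, so the union pops this stream
    }
}
```

Because the trigger sits inside the union, a reader that consumes exactly the available bytes still reaches it on the following read, which is the state the original wrapper approach could miss.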
    • Use "ERR message" for early ReceivePack problems · d8c3e98d
      Shawn Pearce authored
      
      If the application wants to, it can use sendError(String) to send one
      or more error messages to clients before the advertisements are sent.
      These will cause a C Git client to break out of the advertisement
      parsing loop, display "remote error: message\n", and terminate.
      
      Servers can optionally use this to send a detailed error to a client
      explaining why it cannot use the ReceivePack service on a repository.
      Over smart HTTP these errors are sent in a 200 OK response, and
      are in the payload, allowing the Git client to give the end-user
      the custom message rather than the generic error "403 Forbidden".
      
      Change-Id: I03f4345183765d21002118617174c77f71427b5a
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
      d8c3e98d
    • Catch and report "ERR message" during remote advertisements · 1f4a30b8
      Shawn Pearce authored
      
      GitHub broke the native git protocol a while ago by interjecting an
      "ERR message" line into the upload-pack or receive-pack advertisement
      list.  This didn't match the expected pattern, so it caused existing
      C Git clients to abort with a protocol exception.
      
      These days, C Git clients actually look for this message and abort
      with a more graceful notice to the end-user.  JGit should do the
      same, including setting up a custom exception type that makes it
      easier for higher-level UIs to identify a message from the remote
      site and present it to the user.
      
      Change-Id: I51ab62a382cfaf1082210e8bfaa69506fd0d9786
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
      1f4a30b8
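A minimal sketch of the client-side handling this message asks for: detect an "ERR message" line while reading the advertisement and surface it through a dedicated exception type. The class, method and exception names are hypothetical and chosen for illustration only. They are not the names JGit uses.

```java
/** Hypothetical sketch of spotting "ERR message" in an advertisement. */
final class RemoteErrorSketch {
    /** Hypothetical exception carrying the message sent by the remote. */
    static final class RemoteMessageException extends RuntimeException {
        private static final long serialVersionUID = 1L;

        RemoteMessageException(String message) {
            super(message);
        }
    }

    /**
     * Returns the line unchanged when it is a normal advertisement line,
     * or throws when the remote sent "ERR message" instead.
     */
    static String checkAdvertisementLine(String line) {
        if (line.startsWith("ERR "))
            throw new RemoteMessageException(line.substring(4));
        return line;
    }
}
```

A higher-level UI can then catch the one exception type and show its message to the user, instead of pattern-matching on a generic protocol error.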
    • Wait for EOF on stderr before finishing SSH channel · 243b0d64
      Shawn Pearce authored
      
      JSch will allow us to close the connection and then just drop
      any late messages coming over the stderr stream for the command.
      This makes it easy to lose final output on a command, like from
      Gerrit Code Review's post receive hook.
      
      Instead spawn a background thread to copy data from JSch's pipe
      into our own buffer, and wait for that thread to receive EOF on the
      pipe before we declare the connection closed. This way we don't
      have a race condition between the stderr data arriving and JSch
      just tearing down the channel.
      
      Change-Id: Ica1ba40ed2b4b6efb7d5e4ea240efc0a56fb71f6
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
      243b0d64
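The background copy described in this message can be sketched with plain java.io and a thread. `StderrCopierSketch` is a hypothetical class and any InputStream stands in for JSch's stderr pipe. The point it illustrates is that the caller joins the copier thread, and therefore waits for EOF, before treating the connection as closed.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch: drain stderr on a thread, wait for EOF on close. */
final class StderrCopierSketch {
    private final ByteArrayOutputStream captured = new ByteArrayOutputStream();
    private final Thread copier;

    StderrCopierSketch(InputStream stderr) {
        copier = new Thread(() -> {
            byte[] buf = new byte[512];
            try {
                int n;
                // Copy until the remote side closes the stream (EOF).
                while ((n = stderr.read(buf)) >= 0)
                    captured.write(buf, 0, n);
            } catch (IOException e) {
                // In this sketch a read failure simply ends the copy.
            }
        }, "stderr-copier");
        copier.setDaemon(true);
        copier.start();
    }

    /** Blocks until the copier has seen EOF, then returns what it captured. */
    String awaitEof() throws InterruptedException {
        copier.join();
        return new String(captured.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

Calling awaitEof() before tearing down the channel removes the race between late stderr data and the close, at the cost of blocking until the remote side finishes writing.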