
git - Are there any distributed revision control systems that support partial checkout/clone?

coder 2023-06-24 (original)

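As far as I know, all distributed revision control systems require you to clone the whole repository. For that reason, it is unwise to put huge amounts of content into one single repository (thanks to this answer). I know this is not a bug but a feature, but I wonder whether it is a requirement for all distributed revision control systems.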

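In a distributed RCS, the history of a file (or a chunk of content) is a directed acyclic graph, so why can't you clone just this single DAG instead of the set of all graphs in the repository? Maybe I am missing something, but the following use cases are hard to do: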

  • clone only a part of a repository
  • merge two repositories (preserving their histories!)
  • copy some files with their history from one repository to another

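If I reuse parts of other people's code from multiple projects, I cannot preserve their complete history. At least in git, I can think of a (rather complex) workaround: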
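  • clone a full repository
  • delete all content that I am not interested in
  • rewrite the history to delete everything that is not in master
  • merge the remaining repository into an existing repository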
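The four steps above can be sketched with standard git commands. This is only an illustration under stated assumptions: the repositories `source`, `target` and `extract` and the directory `lib/` are hypothetical throwaway names, and `git filter-branch --subdirectory-filter` is just one of several ways to rewrite the history.

```shell
#!/bin/sh
# Sketch of the workaround: keep one subdirectory with its history
# and merge it into another repository. All names are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"
# Throwaway identity so the sketch does not depend on global git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# A "source" repository with two subdirectories and two commits.
git init -q source
(
  cd source
  mkdir lib app
  echo 'lib v1' > lib/util.txt
  echo 'app v1' > app/main.txt
  git add .
  git commit -q -m 'initial import'
  echo 'lib v2' > lib/util.txt
  git commit -q -am 'update lib'
)
# A "target" repository that should receive lib/ together with its history.
git init -q target
(
  cd target
  echo 'readme' > README
  git add .
  git commit -q -m 'target start'
)

# Step 1: clone the full repository.
git clone -q source extract
# Steps 2 and 3: keep only lib/ and rewrite history so nothing else remains.
(
  cd extract
  FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch --subdirectory-filter lib >/dev/null 2>&1
)
# Step 4: merge the rewritten history into the existing repository.
(
  cd target
  git fetch -q ../extract HEAD
  git merge -q --allow-unrelated-histories -m 'import lib with history' FETCH_HEAD
)
# The imported file now carries its original commits.
git -C target log --oneline -- util.txt
```

After the subdirectory filter, the contents of `lib/` sit at the root of the rewritten history, so the file arrives in `target` as `util.txt`. `--allow-unrelated-histories` is needed because the two repositories share no common ancestor.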

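I don't know whether this is also possible with Mercurial or Bazaar, but at the very least it is not easy. So is there any distributed RCS that supports partial checkout/clone by design? It should offer one simple command to get a single file with its history from one repository and merge it into another. That way you would not need to think about how to organize your content into repositories and submodules; you could happily split and merge repositories as needed (the extreme case being one repository per file).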

    Best answer

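    As of Git 2.17 (Q2 2018, 10 years later), it is possible to do what Mercurial planned to implement: a "narrow clone", i.e. a clone where you retrieve data only for a specific subdirectory.
    This is also called a "partial clone".
    It differs from the existing options: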

  • a shallow clone
  • copying what you need from a cloned repository into another working folder.

    See commit 3aa6694, commit aa57b87, commit 35a7ae9, commit 1e1e39b, commit acb0c57, commit bc2d0c3, commit 640d8b7, commit 10ac85c by Jeff Hostetler (jeffhostetler) (08 Dec 2017).
    See commit a1c6d7c, commit c0c578b, commit 548719f, commit a174334, commit 0b6069f by Jonathan Tan (jhowtan) (08 Dec 2017).
    (Merged by Junio C Hamano -- gitster -- in commit 6bed209, 13 Feb 2018)
    Here are the tests for a partial clone:
    git clone --no-checkout --filter=blob:none "file://$(pwd)/srv.bare" pc1 
    

    Other commits are involved in that implementation of a narrow/partial clone,
    in particular commit 8b4c010:

    sha1_file: support lazily fetching missing objects


    Teach sha1_file to fetch objects from the remote configured in extensions.partialclone whenever an object is requested but missing.
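A minimal sketch of what this lazy fetching looks like from the command line, assuming a throwaway local server repository (`srv`, `srv.bare` and `pc1` are hypothetical names created by the script) and a Git version recent enough to support partial clone. It reuses the clone command from the tests above, counts the promised-but-missing objects, reads one blob, and counts again.

```shell
#!/bin/sh
# Sketch: observe on-demand fetching of a missing blob in a partial clone.
set -e
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Throwaway "server" with a single commit.
git init -q srv
( cd srv; echo 'hello' > file.txt; git add file.txt; git commit -q -m 'add file' )
git clone -q --bare srv srv.bare
# The server side has to opt in to filtering.
git -C srv.bare config uploadpack.allowFilter true
# Assumption: only needed with older protocol versions, harmless otherwise.
git -C srv.bare config uploadpack.allowAnySHA1InWant true

# Partial clone without any blobs.
git clone -q --no-checkout --filter=blob:none "file://$work/srv.bare" pc1

# rev-list prints promised-but-absent objects with a leading "?".
count_missing() {
  git -C "$1" rev-list --objects --missing=print HEAD | grep -c '^?' || true
}

missing_before=$(count_missing pc1)
# Reading the blob makes git fetch it from the promisor remote.
content=$(git -C pc1 cat-file -p HEAD:file.txt)
missing_after=$(count_missing pc1)

echo "content: $content"
echo "missing objects before: $missing_before, after: $missing_after"
```

`--missing=print` is used for counting because it reports missing objects without itself triggering a fetch.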



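    A warning regarding Git 2.17/2.18: the recently added experimental "partial clone" feature kicked in when it should not have, i.e. when no partial-clone filter was defined even though extensions.partialclone was set.
    See commit cac1137 by Jonathan Tan (jhowtan) (11 Jun 2018).
    (Merged by Junio C Hamano -- gitster -- in commit 92e1bbc, 28 Jun 2018)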

    upload-pack: disable object filtering when disabled by config


    When upload-pack gained partial clone support (v2.17.0-rc0~132^2~12, 2017-12-08), it was guarded by the uploadpack.allowFilter config item to allow server operators to control when they start supporting it.

    That config item didn't go far enough, though: it controls whether the 'filter' capability is advertised, but if a (custom) client ignores the capability advertisement and passes a filter specification anyway, the server would handle that despite allowFilter being false.

    This is particularly significant if a security bug is discovered in this new experimental partial clone code.
    Installations without uploadpack.allowFilter ought not to be affected since they don't intend to support partial clone, but they would be swept up into being vulnerable.
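The effect of uploadpack.allowFilter can be observed with a small experiment. This is a sketch under assumptions: `srv`, `full` and `partial` are hypothetical throwaway repositories, and the client is a stock git, which falls back to an unfiltered clone (with a warning) when the server does not advertise the filter capability.

```shell
#!/bin/sh
# Sketch: the same filtered clone against a server with and without
# uploadpack.allowFilter enabled.
set -e
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q srv
( cd srv; echo 'data' > blob.txt; git add blob.txt; git commit -q -m 'one' )

# Count objects that are referenced but not present locally.
count_missing() {
  git -C "$1" rev-list --objects --missing=print HEAD | grep -c '^?' || true
}

# Filtering disabled on the server: the filter request is ignored.
git -C srv config uploadpack.allowFilter false
git clone -q --no-checkout --filter=blob:none "file://$work/srv" full 2>/dev/null

# Filtering enabled on the server: blobs are left out of the clone.
git -C srv config uploadpack.allowFilter true
git clone -q --no-checkout --filter=blob:none "file://$work/srv" partial

full_missing=$(count_missing full)
partial_missing=$(count_missing partial)
echo "missing without allowFilter: $full_missing"
echo "missing with allowFilter:    $partial_missing"
```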



    This was improved in Git 2.20 (Q4 2018): "git fetch $repo $object" in a partial clone did not correctly fetch the requested object when it was referenced by an object in a promisor packfile. That has been fixed.
    See commit 35f9e3e, commit 4937291 by Jonathan Tan (jhowtan) (21 Sep 2018).
    (Merged by Junio C Hamano -- gitster -- in commit a1e9dff, 19 Oct 2018)

    fetch: in partial clone, check presence of targets


    When fetching an object that is known as a promisor object to the local repository, the connectivity check in quickfetch() in builtin/fetch.c succeeds, causing object transfer to be bypassed.
    However, this should not happen if that object is merely promised and not actually present.

    Because this happens, when a user invokes "git fetch origin <sha-1>" on the command-line, the <sha-1> object may not actually be fetched even though the command returns an exit code of 0. This is a similar issue (but with a different cause) to the one fixed by a0c9016 ("upload-pack: send refs' objects despite "filter"", 2018-07-09, Git v2.19.0-rc0).

    Therefore, update quickfetch() to also directly check for the presence of all objects to be fetched.



    You can list the objects of a partial clone, excluding the "promisor" objects, with git rev-list --exclude-promisor-objects:

    (For internal use only.) Prefilter object traversal at promisor boundary.
    This is used with partial clone.
    This is stronger than --missing=allow-promisor because it limits the traversal, rather than just silencing errors about missing objects.
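A hedged sketch of the option in use, assuming a Git version with partial clone support (ideally 2.21 or later) and hypothetical throwaway repositories `srv` and `pc`. Everything received from the promisor remote is excluded, so only objects created locally should be listed.

```shell
#!/bin/sh
# Sketch: list only the objects that did not come from the promisor remote.
set -e
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q srv
( cd srv; echo 'hello' > file.txt; git add file.txt; git commit -q -m 'add file' )
git -C srv config uploadpack.allowFilter true
# Assumption: only needed with older protocol versions, harmless otherwise.
git -C srv config uploadpack.allowAnySHA1InWant true

# Partial clone with a checkout; the checkout fetches the needed blob lazily.
git clone -q --filter=blob:none "file://$work/srv" pc
old_commit=$(git -C pc rev-parse HEAD)

# Create one purely local commit on top of the cloned history.
( cd pc; echo 'local' > local.txt; git add local.txt; git commit -q -m 'local change' )
new_commit=$(git -C pc rev-parse HEAD)

# Traversal stops at the promisor boundary.
listed=$(git -C pc rev-list --objects --exclude-promisor-objects HEAD)
echo "$listed"
```

The local commit shows up in the listing, while the commit that was cloned from the server does not.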


    But make sure to use Git 2.21 (Q1 2019) to avoid a segfault.
    See commit 4cf6786 by Matthew DeVore (matvore) (05 Dec 2018).
    (Merged by Junio C Hamano -- gitster -- in commit c333fe7, 14 Jan 2019)

    "git rev-list --exclude-promisor-objects" had to take an object that does not exist locally (and is lazily available) from the command line without barfing, but the code dereferenced NULL.

    list-objects.c: don't segfault for missing cmdline objects

    When a command is invoked with both --exclude-promisor-objects, --objects-edge-aggressive, and a missing object on the command line, the rev_info.cmdline array could get a NULL pointer for the value of an 'item' field.
    Prevent dereferencing of a NULL pointer in that situation.



    Note that Git 2.21 (Q1 2019) also fixes a bug:
    See commit bbcde41 by Matthew DeVore (matvore) (03 Dec 2018).
    (Merged by Junio C Hamano -- gitster -- in commit 6e5be1f, 14 Jan 2019)

    exclude-promisor-objects: declare when option is allowed


    The --exclude-promisor-objects option causes some funny behavior in at least two commands: log and blame.
    It causes a BUG crash:

    $ git log --exclude-promisor-objects
    BUG: revision.c:2143: exclude_promisor_objects can only be used
    when fetch_if_missing is 0
    Aborted
    [134]
    

    Fix this such that the option is treated like any other unknown option.
    The commands that must support it are limited, so declare in those commands that the flag is supported.
    In particular:

    pack-objects
    prune
    rev-list
    

    The commands were found by searching for logic which parses --exclude-promisor-objects outside of revision.c.
    Extra logic outside of revision.c is needed because fetch_if_missing must be turned on before revision.c sees the option or it will BUG-crash. The above list is supported by the fact that no other command is introspectively invoked by another command passing --exclude-promisor-object.



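    Git 2.22 (Q2 2019) optimizes narrow clones:
    When running "git diff" in a lazy clone, Git can know upfront which missing blobs it will need, instead of waiting for the on-demand machinery to discover them one by one.
    The aim is better performance by batching the requests for these promised blobs.
    See commit 7fbbcb2 (05 Apr 2019) and commit 0f4a4fb (29 Mar 2019) by Jonathan Tan (jhowtan).
    (Merged by Junio C Hamano -- gitster -- in commit 32dc15d, 25 Apr 2019)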

    diff: batch fetching of missing blobs


    When running a command like "git show" or "git diff" in a partial clone, batch all missing blobs to be fetched as one request.

    This is similar to c0c578b ("unpack-trees: batch fetching of missing blobs", 2017-12-08, Git v2.17.0-rc0), but for another command.



    Git 2.23 (Q3 2019) future-proofs the batching of missing blobs.
    See commit 31f5256 by Derrick Stolee (derrickstolee) (28 May 2019).
    (Merged by Junio C Hamano -- gitster -- in commit 5d5c46b, 17 Jun 2019)

    sha1-file: split OBJECT_INFO_FOR_PREFETCH


    The OBJECT_INFO_FOR_PREFETCH bitflag was added to sha1-file.c in 0f4a4fb (sha1-file: support OBJECT_INFO_FOR_PREFETCH, 2019-03-29, Git v2.22.0-rc0) and is used to prevent the fetch_objects() method when enabled.

    However, there is a problem with the current use.
    The definition of OBJECT_INFO_FOR_PREFETCH is given by adding 32 to OBJECT_INFO_QUICK.
    This is clearly stated above the definition (in a comment) that this is so OBJECT_INFO_FOR_PREFETCH implies OBJECT_INFO_QUICK.
    The problem is that using "flag & OBJECT_INFO_FOR_PREFETCH" means that OBJECT_INFO_QUICK also implies OBJECT_INFO_FOR_PREFETCH.

    Split out the single bit from OBJECT_INFO_FOR_PREFETCH into a new OBJECT_INFO_SKIP_FETCH_OBJECT as the single bit and keep OBJECT_INFO_FOR_PREFETCH as the union of two flags.


    In addition, "git fetch" into a lazy clone forgot to fetch the base objects needed to complete deltas in a thin packfile. This has been corrected.
    See commit 810e193, commit 5718c53, commit 8a30a1e (11 Jun 2019) and commit 385d1bf (14 May 2019) by Jonathan Tan (jhowtan).
    (Merged by Junio C Hamano -- gitster -- in commit 8867aa8, 21 Jun 2019)

    index-pack: prefetch missing REF_DELTA bases


    When fetching, the client sends "have" commit IDs indicating that the server does not need to send any object referenced by those commits, reducing network I/O.
    When the client is a partial clone, the client still sends "have"s in this way, even if it does not have every object referenced by a commit it sent as "have".

    If a server omits such an object, it is fine: the client could lazily fetch that object before this fetch, and it can still do so after.

    The issue is when the server sends a thin pack containing an object that is a REF_DELTA against such a missing object: index-pack fails to fix the thin pack.
    When support for lazily fetching missing objects was added in 8b4c010 ("sha1_file: support lazily fetching missing objects", 2017-12-08, Git v2.17.0-rc0), support in index-pack was turned off in the belief that it accesses the repo only to do hash collision checks.
    However, this is not true: it also needs to access the repo to resolve REF_DELTA bases.

    Support for lazy fetching should still generally be turned off in index-pack because it is used as part of the lazy fetching process itself (if not, infinite loops may occur), but we do need to fetch the REF_DELTA bases.
    (When fetching REF_DELTA bases, it is unlikely that those are REF_DELTA themselves, because we do not send "have" when making such fetches.)

    To resolve this, prefetch all missing REF_DELTA bases before attempting to resolve them.
    This both ensures that all bases are attempted to be fetched, and ensures that we make only one request per index-pack invocation, and not one request per missing object.



    Git 2.24 (Q4 2019) fixes on-demand object fetching in a lazy clone, which incorrectly tried to fetch commits from submodule projects while still working in the superproject.
    See commit a63694f by Jonathan Tan (jhowtan) (20 Aug 2019).
    (Merged by Junio C Hamano -- gitster -- in commit d8b1ce7, 09 Sep 2019)

    diff: skip GITLINK when lazy fetching missing objs


    In 7fbbcb2 ("diff: batch fetching of missing blobs", 2019-04-08, Git v2.22.0-rc0), diff was taught to batch the fetching of missing objects when operating on a partial clone, but was not taught to refrain from fetching GITLINKs.
    Teach diff to check if an object is a GITLINK before including it in the set to be fetched.



    Git 2.24 (Q4 2019) also introduces the notion of a promisor remote repository.
    See commit 4ca9474, commit 60b7a92, commit db27dca, commit 75de085, commit 7e154ba, commit 9a4c507, commit 5e46139, commit fa3d1b6, commit b14ed5a, commit faf2abf, commit 9cfebc1, commit 9e27bea, commit 48de315, commit 2e86067, commit c59c7c8, commit 90d21f9, commit 5a133e8, commit 489fc9e by Christian Couder (chriscool).
    (Merged by Junio C Hamano -- gitster -- in commit b9ac6c5, 18 Sep 2019)
    The partial-clone documentation defines a promisor remote as follows:

    A remote that can later provide the missing objects is called a promisor remote, as it promises to send the objects when requested.

    Initially Git supported only one promisor remote, the origin remote from which the user cloned and that was configured in the "extensions.partialClone" config option.
    Later support for more than one promisor remote has been implemented.

    Many promisor remotes can be configured and used.

    This allows for example a user to have multiple geographically-close cache servers for fetching missing blobs while continuing to do filtered git-fetch commands from the central server.

    Remotes that are considered "promisor" remotes are those specified by the following configuration variables:

    • extensions.partialClone = <name>
    • remote.<name>.promisor = true
    • remote.<name>.partialCloneFilter = ...

    Only one promisor remote can be configured using the extensions.partialClone config variable. This promisor remote will be the last one tried when fetching objects.
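The configuration variables listed above could be set as follows. This is a sketch only: the repository `repo`, the remote name `cache` and both URLs are hypothetical, nothing is fetched, and the script merely writes the settings and reads them back.

```shell
#!/bin/sh
# Sketch: declare two promisor remotes through configuration.
set -e
work=$(mktemp -d)
cd "$work"
git init -q repo
cd repo

# Hypothetical central server and a hypothetical nearby cache server.
git remote add origin https://example.com/central.git
git remote add cache https://cache.example.com/central.git

# Mark both remotes as promisor remotes.
git config remote.origin.promisor true
git config remote.cache.promisor true
# Default filter for later filtered fetches from the central server.
git config remote.origin.partialCloneFilter blob:none

# Show which remotes are now considered promisor remotes.
git config --get-regexp '^remote\..*\.promisor$'
```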



    Git 2.24 (Q4 2019) also improves the notion of filters in a partial clone.
    See commit c269495, commit cf9ceb5, commit f56f764, commit e987df5, commit 842b005, commit 7a7c7f4, commit 9430147, commit 95acf11, commit c14b6f8, commit 1c37e86 (27 Jun 2019) by Matthew DeVore (matvore).
    (Merged by Junio C Hamano -- gitster -- in commit 627b826, 18 Sep 2019)
    It allows:
    • combining filters such that only objects accepted by all filters are shown.
      The motivation for this is to allow getting directory listings without also fetching blobs. This can be done by combining blob:none with tree:<depth>.
      There are massive repositories that have larger-than-expected trees - even if you include only a single commit.

    A combined filter supports any number of subfilters, and is written in the following form:

    combine:<filter 1>+<filter 2>+<filter 3>
    
    • combining of multiple filters by simply repeating the --filter flag.
      Before, the user had to combine them in a single flag somewhat awkwardly (e.g. --filter=combine:FOO+BAR), including URL-encoding the individual filters.
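As an illustration of repeating the --filter flag, the sketch below combines blob:none with tree:2 against a throwaway local server. It assumes Git 2.24 or later on both sides; `srv` and `combo` are hypothetical names created by the script.

```shell
#!/bin/sh
# Sketch: combine two filters by repeating --filter.
set -e
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Throwaway server with a nested directory.
git init -q srv
(
  cd srv
  mkdir -p dir/sub
  echo 'top' > top.txt
  echo 'deep' > dir/sub/file.txt
  git add .
  git commit -q -m 'tree with nesting'
)
git -C srv config uploadpack.allowFilter true

# No blobs at all, and only the first two levels of trees.
git clone -q --no-checkout --filter=blob:none --filter=tree:2 "file://$work/srv" combo

# The filter remembered for later fetches from this remote.
git -C combo config remote.origin.partialclonefilter || true

# Objects that were left out are printed with a leading "?" and counted here.
missing=$(git -C combo rev-list --objects --missing=print HEAD | grep -c '^?' || true)
echo "objects left out by the combined filter: $missing"
```

Only objects accepted by both filters are transferred, so the clone can list the top-level directories without holding any file contents.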


    With Git 2.27 (Q2 2020), "git diff" in a partial clone learned to avoid lazily loading blob objects in more cases where they are not needed.
    See commit db7ed74, commit 23547c4 and commit 625e7f1 (02 and 07 Apr 2020) by Jonathan Tan (jhowtan).
    (Merged by Junio C Hamano -- gitster -- in commit 8f5dc5a, 28 Apr 2020)

    diff: restrict when prefetching occurs

    Helped-by: Jeff King
    Signed-off-by: Jonathan Tan


    Commit 7fbbcb21b1 ("diff: batch fetching of missing blobs", 2019-04-08, Git v2.22.0-rc0 -- merge listed in batch #7) optimized "diff" by prefetching blobs in a partial clone, but there are some cases wherein blobs do not need to be prefetched.
    In these cases, any command that uses the diff machinery will unnecessarily fetch blobs.


    diffcore_std() may read blobs when it calls the following functions:

    1. diffcore_skip_stat_unmatch() (controlled by the config variable diff.autorefreshindex)
    2. diffcore_break() and diffcore_merge_broken() (for break-rewrite detection)
    3. diffcore_rename() (for rename detection)
    4. diffcore_pickaxe() (for detecting addition/deletion of specified string)

    Instead of always prefetching blobs, teach diffcore_skip_stat_unmatch(), diffcore_break(), and diffcore_rename() to prefetch blobs upon the first read of a missing object.
    This covers (1), (2), and (3): to cover the rest, teach diffcore_std() to prefetch if the output type is one that includes blob data (and hence blob data will be required later anyway), or if it knows that (4) will be run.



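    Note that the lazy fetching done internally to make missing objects available in a partial clone incorrectly caused permanent damage to the partial clone filter in the repository. This has been corrected in Git 2.29 (Q4 2020).
    See commit e68f0a4 (28 Sep 2020) and a further commit (21 Sep 2020).
    (Merged by Junio C Hamano -- gitster --, 05 Oct 2020)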

    fetch: do not override partial clone filter

    Signed-off-by: Jonathan Tan


    When a fetch with the --filter argument is made, the configured default filter is set even if one already exists. This change was made in 5e46139376 ("builtin/fetch: remove unique promisor remote limitation", 2019-06-25, Git v2.24.0-rc0 -- merge listed in batch #3) - in particular, changing from:

     * If this is the FIRST partial-fetch request, we enable partial
     * on this repo and remember the given filter-spec as the default
     * for subsequent fetches to this remote.

    to:

     * If this is a partial-fetch request, we enable partial on
     * this repo if not already enabled and remember the given
     * filter-spec as the default for subsequent fetches to this
     * remote.

    (The given filter-spec is "remembered" even if there is already an existing one.)

    This is problematic whenever a lazy fetch is made, because lazy fetches are made using "git fetch --filter=blob:none", but this will also happen if the user invokes "git fetch --filter=<filter>" manually. Therefore, restore the behavior prior to 5e46139376, which writes a filter-spec only if the current fetch request is the first partial-fetch one (for that remote).

    Regarding "git - Are there any distributed revision control systems that support partial checkout/clone?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/3098029/
