Starting from an infinite loop in a yarn command

Original link: https://4ark.me/post/yarn-cwd-issue.html

Foreword

Recently I had an idea: whenever dependencies are installed in any subpackage of a yarn workspace project, I wanted to run some initialization and configuration-sync steps.

Along the way, however, I ran into an interesting problem with yarn --cwd. I am recording it here in the hope that it helps whoever hits it next.

What’s the problem

Let’s first describe the project. It is a monorepo managed with yarn workspaces, using yarn v1.22.11. The directory structure is roughly as follows:

monorepo
├── package.json
├── app-a
│   └── package.json
├── app-b
│   └── package.json
└── config
    └── package.json

Both app-a and app-b use the shared package config:

"dependencies": {
  "@monorepo/config": "../config"
}

We need to do some initialization in the preinstall hook in the package.json of the root directory:

"scripts": {
  "preinstall": "./bin/init.sh"
}

At this point, executing yarn or yarn add <pkg-name> in the root directory will trigger the preinstall hook, but executing yarn in app-a will not trigger the preinstall hook in the root directory.

Therefore, we need to add this line to each subpackage, so that the root directory’s preinstall script runs whenever a subpackage installs dependencies:

"scripts": {
  "preinstall": "yarn --cwd ../ preinstall"
}

Then something strange happened. When I ran yarn in app-a, it hung at the stage of installing @monorepo/config, and my computer became noticeably sluggish. I opened htop and, lo and behold, the whole screen was full of:

4ark 40987 26.3 0.5 409250368 78624 ?? R 8:36PM 0:00.09 /usr/local/bin/node /usr/local/bin/yarn --cwd ../ preinstall

CPU usage shot straight to 100%, which scared me into killing these processes right away:

ps aux | grep preinstall | awk '{print $2}' | xargs kill -9

Analyzing the cause

Once the scare was over, it was time to analyze the cause. The command had obviously fallen into an infinite loop, spawning more and more processes. Yet when I ran yarn --cwd ../ preinstall manually in each subpackage, everything was normal. So where was the problem?

So I executed yarn again and copied the process information with the following command for analysis:

ps -ef | pbcopy

This confirmed my guess: the command keeps triggering itself, with each process spawning the next, resulting in an infinite loop:

UID   PID  PPID C STIME  TTY    TIME CMD
501 50399 50379 0 8:50PM ??  0:00.10 /usr/local/bin/node /usr/local/bin/yarn --cwd ../ preinstall
501 50400 50399 0 8:50PM ??  0:00.11 /usr/local/bin/node /usr/local/bin/yarn --cwd ../ preinstall
501 50401 50400 0 8:50PM ??  0:00.11 /usr/local/bin/node /usr/local/bin/yarn --cwd ../ preinstall
501 50402 50401 0 8:50PM ??  0:00.12 /usr/local/bin/node /usr/local/bin/yarn --cwd ../ preinstall

Since all three sub-packages run the same command, it was not clear which sub-package was responsible, so I modified the command in each one to tell them apart:

"scripts": {
  "preinstall": "echo app-a && yarn --cwd ../ preinstall"
}

This showed that the problem was in the config sub-package. I removed the preinstall command from that sub-package and, sure enough, the problem disappeared, which was very strange.

Is there something wrong with the --cwd ../ path? To verify, I changed the command to this:

"scripts": {
  "preinstall": "pwd && yarn --cwd ../ preinstall"
}

The pwd output looks like this:

/4ark/projects/monorepo/app-a/node_modules/@monorepo/config

This output reveals two problems. The first is:

  • By the time the preinstall script of the shared package runs, the package has already been copied into the node_modules of app-a, so the script does not run in the package’s source directory, and --cwd ../ does not point to the project root directory.

This is easy to understand. After all, config is a dependency, so it should indeed be copied into the node_modules of the application.
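To make the first problem concrete, here is a minimal sketch using Node’s path.posix.resolve. The directories are the ones from this article; the variable names are only for illustration.

```javascript
const path = require('path');

// Where the config package lives in the repository.
const sourceDir = '/4ark/projects/monorepo/config';
// Where it sits when its preinstall script runs (the pwd output above).
const installedDir = '/4ark/projects/monorepo/app-a/node_modules/@monorepo/config';

// Resolve "../" against each location, the way `cd ../` would.
const fromSource = path.posix.resolve(sourceDir, '../');
const fromInstalled = path.posix.resolve(installedDir, '../');

console.log(fromSource);    // /4ark/projects/monorepo
console.log(fromInstalled); // /4ark/projects/monorepo/app-a/node_modules/@monorepo
```

Only the first result is the project root; from the copied location, "../" lands inside node_modules.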

The second problem is harder to understand. Why, with --cwd ../ set, does the command still run in the current directory? I expected cwd to point to:

/4ark/projects/monorepo/app-a/node_modules/@monorepo

Could my understanding of the cwd parameter be wrong? Here is the description of cwd in the yarn documentation:

Specifies a current working directory, instead of the default ./. Use this flag to perform an operation in a working directory that is not the current one.

This can make scripts nicer by avoiding the need to cd into a folder and then cd back out.

Going by the documentation, the purpose of cwd is to replace cd. Yet the result suggests that yarn --cwd ../ preinstall is not equivalent to cd ../ && yarn preinstall.

This raises the question of how yarn actually resolves cwd. I searched online and could not find any relevant discussion, so the only option was to look for the answer in the yarn source code myself.

Analyzing the source code

As mentioned earlier, we are using yarn v1.22.11. In yarn’s GitHub repository, the latest v1 version is v1.23.0-0, so that is the version whose source code we will analyze. First, clone the code locally:

git clone --depth=1 https://github.com/yarnpkg/yarn

Then install the dependencies and run:

yarn && yarn watch

This watches for code changes and recompiles automatically. Checking package.json, we find that yarn’s bin entries both point to ./bin/yarn.js:

"bin": {
  "yarn": "./bin/yarn.js",
  "yarnpkg": "./bin/yarn.js"
},

That is, running bin/yarn.js directly has the same effect as running yarn. Let’s check the version:

> /Users/4ark/projects/yarn/bin/yarn -v
1.23.0-0

PS: Of course, you can also run npm link in the project directory to link it locally.

The next step is debugging, which eventually leads to the code that answers our question:

function findProjectRoot(base: string): string {
  let prev = null;
  let dir = base;

  do {
    if (fs.existsSync(path.join(dir, constants.NODE_PACKAGE_JSON))) {
      return dir;
    }

    prev = dir;
    dir = path.dirname(dir);
  } while (dir !== prev);

  return base;
}

const cwd = command.shouldRunInCurrentCwd ? commander.cwd : findProjectRoot(commander.cwd);

As the code shows, cwd is resolved by checking whether package.json exists in the given directory. If it does, that directory is returned. Otherwise the directory is passed through path.dirname and the search continues, until path.dirname no longer changes the path; in that case the original directory is returned.
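The walk can be tried outside yarn. The following is a minimal sketch, not yarn’s actual module: it is plain JavaScript without the Flow types, it assumes constants.NODE_PACKAGE_JSON is the string 'package.json', and the temporary directory layout is made up for the demonstration.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Same upward walk as the snippet above, with 'package.json' hard-coded (assumption).
function findProjectRoot(base) {
  let prev = null;
  let dir = base;
  do {
    if (fs.existsSync(path.join(dir, 'package.json'))) {
      return dir;
    }
    prev = dir;
    dir = path.dirname(dir);
  } while (dir !== prev);
  return base;
}

// Throwaway layout: <tmp>/app-a/package.json and <tmp>/app-a/node_modules/@monorepo/
const tmp = fs.mkdtempSync(path.join(os.tmpdir(), 'find-root-'));
const appDir = path.join(tmp, 'app-a');
const scopeDir = path.join(appDir, 'node_modules', '@monorepo');
fs.mkdirSync(scopeDir, { recursive: true });
fs.writeFileSync(path.join(appDir, 'package.json'), '{}');

// From an absolute start, the walk climbs two levels and stops at app-a.
const found = findProjectRoot(scopeDir);
console.log(found === appDir); // true

// Clean up the temporary directory.
fs.rmSync(tmp, { recursive: true, force: true });
```

Note that the starting path here is absolute, which is the well-behaved case.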

The most important thing here is the return value of path.dirname, so let’s look at how the documentation describes it:

The path.dirname() method returns the directory name of a path, similar to the Unix dirname command. Trailing directory separators are ignored, see path.sep.

In other words, it returns the directory part of a path, consistent with the Unix dirname command. It is usually used like this:

> dirname /4ark/app/index.js
/4ark/app

> dirname /4ark/app/packages/index.js
/4ark/app/packages

Is it naive to assume that it simply returns the directory one level above a path? For an absolute path that assumption holds, but for a relative path things are different:

> dirname ../app/index.js
../app

> dirname ../../
..

> dirname ../
Q: What will be returned?

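The answer is ., the current directory. dirname works purely on the string: it drops the last path component, and once .. is dropped nothing is left, so it returns . instead.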
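The same results can be checked with Node directly. This is a small sketch using path.posix.dirname, so the output does not depend on the platform:

```javascript
const path = require('path');

// path.posix keeps the results identical on every platform.
const dirname = path.posix.dirname;

console.log(dirname('/4ark/app/index.js')); // /4ark/app
console.log(dirname('../app/index.js'));    // ../app
console.log(dirname('../../'));             // ..
console.log(dirname('../'));                // .
console.log(dirname('.'));                  // .
```

The last line also shows why a do/while loop like the one in findProjectRoot terminates on a relative path: once the path is ".", path.dirname returns the same value again.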

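Now we can answer the earlier question: why does yarn --cwd ../ preinstall, run in node_modules/@monorepo/config, execute in the current directory? Because the parent directory node_modules/@monorepo has no package.json, the search moves on to path.dirname('../'), which is ., and the current directory does have a package.json. So cwd ends up pointing at the current directory, where the preinstall script is this very command, and it calls itself over and over.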
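This can be reproduced without touching the file system. The sketch below follows the same walk but records every directory it checks; traceProjectRoot and hasPackageJson are hypothetical names, and the predicate stands in for the fs.existsSync check.

```javascript
const path = require('path');

// Upward walk in the style of findProjectRoot, with the package.json check
// injected so the example runs anywhere. `hasPackageJson` is a hypothetical
// stand-in for fs.existsSync(path.join(dir, 'package.json')).
function traceProjectRoot(base, hasPackageJson) {
  const visited = [];
  let prev = null;
  let dir = base;
  do {
    visited.push(dir);
    if (hasPackageJson(dir)) {
      return { root: dir, visited };
    }
    prev = dir;
    dir = path.posix.dirname(dir);
  } while (dir !== prev);
  return { root: base, visited };
}

// Seen from node_modules/@monorepo/config: "." has a package.json,
// while "../" (node_modules/@monorepo) does not.
const result = traceProjectRoot('../', dir => dir === '.');

console.log(result.visited); // [ '../', '.' ]
console.log(result.root);    // .
```

The resolved root is ".", the config package itself, rather than anything above it.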

If you are interested in how path.dirname is implemented in Node.js, see path.js#L538-L554.

Solution

Once the cause is known, the problem is not hard to solve: replace the relative path with an absolute path.

In yarn --cwd ../ preinstall, the ../ is meant to be the root directory of the project, so we can obtain that directory in some other way, for example from git:

git rev-parse --show-toplevel

So, we changed the command to this, and the problem was solved:

- yarn --cwd ../ preinstall
+ yarn --cwd $(git rev-parse --show-toplevel) preinstall

It is also worth mentioning that yarn v2 added a --top-level option, which addresses exactly this kind of problem.

Epilogue

Looking back, in this article’s example there is no need to add a preinstall hook to the config directory at all. It is a shared package, and every change to it requires a reinstall in the places that use it, so it is enough to make sure preinstall runs in those places. In that case the problem described here would never occur.

Still, running into pitfalls is not a bad thing. Once you understand the reasons behind them, the problem stops being a problem.
