The right way to terminate find -exec

I came across a problem where I had to copy all *.txt files recursively to another directory. Here’s the content of the workspace directory:

[root@centos ~]# tree workspace/
workspace/
`-- files
    |-- documents
    |   |-- doc0.txt
    |   |-- doc1.txt
    |   `-- doc2.txt
    |-- misc
    |   |-- misc0.txt
    |   |-- misc1.txt
    |   `-- misc2.txt
    `-- notes
        |-- notes0.txt
        |-- notes1.txt
        `-- notes2.txt

4 directories, 9 files

The following command didn’t work as expected:

cp -R workspace/*.txt backup/

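Even ls -R workspace/*.txt does not work, because the star wildcard is expanded by the shell, not by the command being executed. The shell only looks for .txt files directly inside workspace/, and there are none, so cp never sees the files in the subdirectories. -R only recurses into directories that are passed as arguments.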
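To see what the command actually receives, here is a small sketch that lets printf show the result of the expansion. The scratch directory and file names are made up for the demo, and it assumes the default shell options (no nullglob or failglob).

```shell
# Hypothetical scratch tree mirroring part of the layout above.
demo=$(mktemp -d)
mkdir -p "$demo/workspace/files/notes"
touch "$demo/workspace/files/notes/notes0.txt"
cd "$demo" || exit 1

# No .txt file sits directly in workspace/, so the pattern matches nothing
# and is handed to the command unchanged.
top_level=$(printf '%s\n' workspace/*.txt)
echo "$top_level"

# Each "*/" covers exactly one directory level, so this pattern
# reaches the nested file.
nested=$(printf '%s\n' workspace/*/*/*.txt)
echo "$nested"
```

The first echo prints the pattern itself, which is exactly the "file name" cp was complaining about.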

Using find

find is able to find all the .txt files:

[root@centos ~]# find workspace/ -iname "*.txt"
workspace/files/notes/notes0.txt
workspace/files/notes/notes1.txt
workspace/files/notes/notes2.txt
workspace/files/documents/doc0.txt
workspace/files/documents/doc1.txt
workspace/files/documents/doc2.txt
workspace/files/misc/misc0.txt
workspace/files/misc/misc1.txt
workspace/files/misc/misc2.txt

We can pass all files as arguments to cp with find -exec, but first we have to choose either ; or + to terminate the command.

The difference between ; and + is how the found files are handed to the command:

  • ; (semicolon) runs the command once for each file, passing one argument at a time:
[root@centos ~]# find workspace/ -iname "*.txt" -exec echo Arguments: {} ';'
Arguments: workspace/files/notes/notes0.txt
Arguments: workspace/files/notes/notes1.txt
Arguments: workspace/files/notes/notes2.txt
Arguments: workspace/files/documents/doc0.txt
Arguments: workspace/files/documents/doc1.txt
Arguments: workspace/files/documents/doc2.txt
Arguments: workspace/files/misc/misc0.txt
Arguments: workspace/files/misc/misc1.txt
Arguments: workspace/files/misc/misc2.txt

echo was executed 9 times.

  • + (plus sign) appends as many files as possible to a single invocation (output truncated):
[root@centos ~]# find workspace/ -iname "*.txt" -exec echo Arguments: {} '+'
Arguments: workspace/files/notes/notes0.txt workspace/files/notes/notes1.txt ... ...

A single invocation of echo.
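A quick way to check the difference is to count the output lines, since every invocation of echo prints exactly one line. The following is a minimal sketch on a throwaway directory; the file names are invented for the demo.

```shell
# Hypothetical scratch tree with three matching files.
demo=$(mktemp -d)
mkdir -p "$demo/a" "$demo/b"
touch "$demo/a/one.txt" "$demo/a/two.txt" "$demo/b/three.txt"

# ';' runs echo once per file: one output line per match.
per_file=$(find "$demo" -name '*.txt' -exec echo Arguments: {} ';' | wc -l | tr -d ' ')

# '+' packs the matches into as few invocations as possible: one line here.
batched=$(find "$demo" -name '*.txt' -exec echo Arguments: {} + | wc -l | tr -d ' ')

echo "with ';': $per_file invocation(s), with '+': $batched invocation(s)"
```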

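Note: The total length of the command-line arguments is limited by the system (getconf ARG_MAX prints the limit). It’s complicated. You can read more about it here.

To stay under the limit, use + with find -exec, or use xargs. Both split a long list of files across several invocations when it does not fit into one.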
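For the xargs route, a minimal sketch could look like this. It assumes GNU cp (for -t) and a find/xargs pair that supports -print0 and -0, which is the case for the GNU and BSD versions. The directory and file names are hypothetical.

```shell
# Upper bound, in bytes, on arguments plus environment for one command.
getconf ARG_MAX

# Hypothetical scratch tree, including a file name with a space.
demo=$(mktemp -d)
mkdir -p "$demo/src/sub" "$demo/backup"
touch "$demo/src/a.txt" "$demo/src/sub/b.txt" "$demo/src/sub/with space.txt"

# -print0 and -0 separate names with NUL bytes, so spaces and newlines
# in file names survive the pipe. xargs starts additional cp processes
# on its own if the list does not fit into one command line.
find "$demo/src" -name '*.txt' -print0 | xargs -0 cp -t "$demo/backup/"

copied=$(ls "$demo/backup" | wc -l | tr -d ' ')
echo "$copied files copied"
```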

Let’s solve the problem with:

find workspace/ -iname "*.txt" -exec cp -t backup/ {} +

Result:

[root@centos ~]# tree workspace/ backup/
workspace/
`-- files
    |-- documents
    |   |-- doc0.txt
    |   |-- doc1.txt
    |   `-- doc2.txt
    |-- misc
    |   |-- misc0.txt
    |   |-- misc1.txt
    |   `-- misc2.txt
    `-- notes
        |-- notes0.txt
        |-- notes1.txt
        `-- notes2.txt
backup/
|-- doc0.txt
|-- doc1.txt
|-- doc2.txt
|-- misc0.txt
|-- misc1.txt
|-- misc2.txt
|-- notes0.txt
|-- notes1.txt
`-- notes2.txt

4 directories, 18 files

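This works with + because {} has to be the last argument before the +, so the file names always end up at the end of the command line. cp normally treats its last argument as the destination folder, which is why the destination is given up front with -t backup/ (a GNU cp option).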
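cp -t is a GNU extension. Where it is not available, one possible workaround is a small sh -c wrapper that moves the destination back to the end. This is only a sketch: the dest variable and the scratch paths are names made up for the example.

```shell
# Hypothetical scratch tree.
demo=$(mktemp -d)
mkdir -p "$demo/src/sub" "$demo/backup"
touch "$demo/src/a.txt" "$demo/src/sub/b.txt"

# The inner shell gets the destination as $1 and the batch of files after
# it. After the shift, "$@" holds only the files, so cp can take the
# destination as its last argument as usual.
find "$demo/src" -name '*.txt' \
    -exec sh -c 'dest=$1; shift; cp "$@" "$dest"' sh "$demo/backup/" {} +

ls "$demo/backup"
```

The extra sh after the quoted script fills $0 of the inner shell, so that the destination lands in $1.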

Solutions for listing files

After spending some time on Stack Overflow, I found 3 more techniques for listing files in subdirectories using wildcards:

  • Simple glob (matches exactly one directory level down):
ls */*.txt
  • Recursive glob using bash (ugly syntax):
shopt -s globstar
ls **/*.txt
  • and locate (quote the pattern so the shell does not expand it first):
updatedb
locate '*/*.txt'

I prefer locate and find.
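To compare the two globs side by side, here is a small sketch. It assumes a bash of version 4.0 or newer is installed, because globstar does not exist in older versions or in plain sh. The directory layout is invented for the demo.

```shell
# Hypothetical scratch tree with .txt files at three depths.
demo=$(mktemp -d)
mkdir -p "$demo/one/two"
touch "$demo/top.txt" "$demo/one/mid.txt" "$demo/one/two/deep.txt"
cd "$demo" || exit 1

# */*.txt matches exactly one directory level down.
one_level=$(printf '%s\n' */*.txt)
echo "$one_level"

# With globstar, ** matches zero or more directory levels.
# It is run in an explicit bash because the option is bash-only.
all_levels=$(bash -c 'shopt -s globstar; printf "%s\n" **/*.txt')
echo "$all_levels"
```

The simple glob only finds one/mid.txt, while the globstar version also picks up top.txt and one/two/deep.txt.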


“Don’t just sit there. Do something. The answers will follow.” - Mark Manson, The Subtle Art of Not Giving a Fuck