By ibz


2009-06-08 14:00:29 8 Comments

I want to get the filename (without extension) and the extension separately.

The best solution I found so far is:

NAME=`echo "$FILE" | cut -d'.' -f1`
EXTENSION=`echo "$FILE" | cut -d'.' -f2`

This is wrong because it doesn't work if the file name contains multiple . characters. If, let's say, I have a.b.js, it will consider a and b.js, instead of a.b and js.

It can be easily done in Python with

file, ext = os.path.splitext(path)

but I'd prefer not to fire up a Python interpreter just for this, if possible.

Any better ideas?

30 comments

@paxdiablo 2009-06-08 14:14:37

pax> echo a.b.js | sed 's/\.[^.]*$//'
a.b
pax> echo a.b.js | sed 's/^.*\.//'
js

works fine, so you can just use:

pax> FILE=a.b.js
pax> NAME=$(echo "$FILE" | sed 's/\.[^.]*$//')
pax> EXTENSION=$(echo "$FILE" | sed 's/^.*\.//')
pax> echo $NAME
a.b
pax> echo $EXTENSION
js

The commands, by the way, work as follows.

The command for NAME substitutes a "." character followed by any number of non-"." characters up to the end of the line, with nothing (i.e., it removes everything from the final "." to the end of the line, inclusive). This is basically a non-greedy substitution using regex trickery.

The command for EXTENSION substitutes any number of characters followed by a "." character at the start of the line, with nothing (i.e., it removes everything from the start of the line to the final dot, inclusive). This is a greedy substitution which is the default action.
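As a rough sketch of how the two sed commands behave together, they can be wrapped in a small helper. The function name split_with_sed and the case guard for names without a dot are assumptions added for illustration; without the guard, the EXTENSION command leaves a dot-less name unchanged.

```bash
# Hypothetical helper: prints "<name>|<extension>" for a bare file name.
split_with_sed() {
    local file=$1 name ext
    case $file in
        *.*)
            name=$(printf '%s\n' "$file" | sed 's/\.[^.]*$//')   # drop the final ".xxx"
            ext=$(printf '%s\n' "$file" | sed 's/^.*\.//')       # drop everything up to the final "."
            ;;
        *)
            name=$file   # no dot at all: treat the whole string as the name
            ext=
            ;;
    esac
    printf '%s|%s\n' "$name" "$ext"
}

split_with_sed a.b.js    # a.b|js
split_with_sed README    # README|
```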

@hIpPy 2018-10-07 05:11:06

This breaks for files without an extension, as it would print the same value for name and extension. So I use sed 's,\.[^\.]*$,,' for name, and sed 's,.*\.,., ;t ;g' for extension (uses the atypical test and get commands, along with the typical substitute command).

@sotapme 2013-02-05 09:09:58

You can use the magic of POSIX parameter expansion:

bash-3.2$ FILENAME=somefile.tar.gz
bash-3.2$ echo "${FILENAME%%.*}"
somefile
bash-3.2$ echo "${FILENAME%.*}"
somefile.tar

There's a caveat in that if your filename was of the form ./somefile.tar.gz then echo ${FILENAME%%.*} would greedily remove the longest match to the . and you'd have the empty string.

(You can work around that with a temporary variable:

FULL_FILENAME=$FILENAME
FILENAME=${FULL_FILENAME##*/}
echo ${FILENAME%%.*}

)


This site explains more.

${variable%pattern}
  Trim the shortest match from the end
${variable##pattern}
  Trim the longest match from the beginning
${variable%%pattern}
  Trim the longest match from the end
${variable#pattern}
  Trim the shortest match from the beginning

@jwadsack 2014-07-18 16:40:25

Much simpler than Joachim's answer but I always have to look up POSIX variable substitution. Also, this runs on Mac OS X where cut doesn't have --complement and sed doesn't have -r.

@Juliano 2009-06-08 14:05:13

~% FILE="example.tar.gz"

~% echo "${FILE%%.*}"
example

~% echo "${FILE%.*}"
example.tar

~% echo "${FILE#*.}"
tar.gz

~% echo "${FILE##*.}"
gz

For more details, see shell parameter expansion in the Bash manual.

@rmeador 2009-06-08 14:50:20

You (perhaps unintentionally) bring up the excellent question of what to do if the "extension" part of the filename has 2 dots in it, as in .tar.gz... I've never considered that issue, and I suspect it's not solvable without knowing all the possible valid file extensions up front.

@Juliano 2009-06-08 15:20:49

Why not solvable? In my example, it should be considered that the file contains two extensions, not an extension with two dots. You handle both extensions separately.

@porges 2009-06-13 09:11:04

It is unsolvable on a lexical basis, you'll need to check the file type. Consider if you had a game called dinosaurs.in.tar and you gzipped it to dinosaurs.in.tar.gz :)

@iksemyonov 2012-03-02 07:46:33

Doesn't work on Fedora 16 x86_64 GNU bash, version 4.2.20(1)-release (x86_64-redhat-linux-gnu). Both ${FILE#.*} and ${FILE##.*} return example.tar.gz.

@Walt Sellers 2012-03-05 18:49:56

This gets more complicated if you are passing in full paths. One of mine had a '.' in a directory in the middle of the path, but none in the file name. Example "a/b.c/d/e/filename" would wind up ".c/d/e/filename"
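One way around this, as a minimal sketch: strip the directory part first, then look for a dot only in the last path component. The variable names base, name and ext are chosen here for illustration.

```bash
path="a/b.c/d/e/filename"

base=${path##*/}          # last path component: "filename"
case $base in
    *.*) name=${base%.*}; ext=${base##*.} ;;
    *)   name=$base;      ext= ;;            # no dot in the file name itself
esac

printf 'name=%s ext=%s\n' "$name" "$ext"     # name=filename ext=
```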

@v.oddou 2013-11-26 07:29:17

Clearly not: x.tar.gz's extension is gz and the filename is x.tar, and that is it. There is no such thing as dual extensions. I'm pretty sure boost::filesystem handles it that way (split path, change_extension...), and its behavior is based on Python, if I'm not mistaken.

@John Oxley 2016-10-18 14:59:25

If your file has .s in it and the "extension" .tar.gz, you can use ${FILE%.*.*}

@Ken Mueller 2019-06-08 07:42:26

This is the only one that worked for me:

path='folder/other_folder/file.js'

base=${path##*/}
echo ${base%.*}

>> file

This can also be used in string interpolation, but unfortunately you have to set base beforehand.

@SCS 2011-07-01 08:20:22

You can also use a for loop and tr to extract the filename from the path...

for x in `echo $path | tr "/" " "`; do filename=$x; done

The tr replaces all "/" delimiters in path with spaces so making a list of strings, and the for loop scans through them leaving the last one in the filename variable.

@dubiousjim 2012-05-30 21:44:39

If you're going to do it this way, save yourself some forks and instead use: (IFS=/ ; for x in $path; do filename=$x; done). The (...) subshell is needed to localize the assignment to IFS.
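A minimal sketch of that idea follows. One thing to keep in mind is that a variable assigned inside the (...) subshell is not visible afterwards, so this version prints the result from the subshell and captures it with a command substitution. The sample path and the set -f call (which stops glob expansion of the unquoted $path) are assumptions added for illustration.

```bash
path="/some/dir/archive.tar.gz"

filename=$(
    IFS=/          # split on slashes; the change stays inside this subshell
    set -f         # disable globbing so unquoted $path is only word-split
    for x in $path; do last=$x; done
    printf '%s\n' "$last"
)

echo "$filename"   # archive.tar.gz
```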

@scolfax 2016-10-13 05:56:44

Here is a sed solution that extracts path components in a variety of forms and can handle most edge cases:

## Enter the input path and field separator character, for example:
## (separatorChar must not be present in inputPath)

inputPath="/path/to/Foo.bar"
separatorChar=":"

## sed extracts the path components and assigns them to output variables

oldIFS="$IFS"
IFS="$separatorChar"
read dirPathWithSlash dirPath fileNameWithExt fileName fileExtWithDot fileExt <<<"$(sed -En '
s/^[[:space:]]+//
s/[[:space:]]+$//
t l1
:l1
s/^([^/]|$)//
t
s/[/]+$//
t l2
:l2
s/^$/filesystem\/\
filesystem/p
t
h
s/^(.*)([/])([^/]+)$/\1\2\
\1\
\3/p
g
t l3
:l3
s/^.*[/]([^/]+)([.])([a-zA-Z0-9]+)$/\1\
\2\3\
\3/p
t
s/^.*[/](.+)$/\1/p
' <<<"$inputPath" | tr "\n" "$separatorChar")"
IFS="$oldIFS"

## Results (all use separatorChar=":")

## inputPath        = /path/to/Foo.bar
## dirPathWithSlash = /path/to/
## dirPath          = /path/to 
## fileNameWithExt  = Foo.bar
## fileName         = Foo
## fileExtWithDot   = .bar
## fileExt          = bar

## inputPath        = /path/to/Foobar
## dirPathWithSlash = /path/to/
## dirPath          = /path/to 
## fileNameWithExt  = Foobar
## fileName         = Foobar
## fileExtWithDot   =
## fileExt          =

## inputPath        = /path/to/...bar
## dirPathWithSlash = /path/to/
## dirPath          = /path/to 
## fileNameWithExt  = ...bar
## fileName         = ..
## fileExtWithDot   = .bar
## fileExt          = bar

## inputPath        = /path/to/..bar
## dirPathWithSlash = /path/to/
## dirPath          = /path/to 
## fileNameWithExt  = ..bar
## fileName         = .
## fileExtWithDot   = .bar
## fileExt          = bar

## inputPath        = /path/to/.bar
## dirPathWithSlash = /path/to/
## dirPath          = /path/to 
## fileNameWithExt  = .bar
## fileName         = .bar
## fileExtWithDot   = 
## fileExt          = 

## inputPath        = /path/to/...
## dirPathWithSlash = /path/to/
## dirPath          = /path/to 
## fileNameWithExt  = ...
## fileName         = ...
## fileExtWithDot   =
## fileExt          =

## inputPath        = /path/to/Foo.
## dirPathWithSlash = /path/to/
## dirPath          = /path/to 
## fileNameWithExt  = Foo.
## fileName         = Foo.
## fileExtWithDot   =
## fileExt          =

## inputPath        = / (the root directory)
## dirPathWithSlash = filesystem/
## dirPath          = filesystem
## fileNameWithExt  =
## fileName         =
## fileExtWithDot   =
## fileExt          =

## inputPath        =  (invalid because empty)
## dirPathWithSlash =
## dirPath          =
## fileNameWithExt  =
## fileName         =
## fileExtWithDot   =
## fileExt          =

## inputPath        = Foo/bar (invalid because doesn't start with a forward slash)
## dirPathWithSlash =
## dirPath          =
## fileNameWithExt  =
## fileName         =
## fileExtWithDot   =
## fileExt          =

Here's how it works:

sed parses the input path and prints the following path components in order on separate lines:

  • directory path with a trailing slash character
  • directory path without a trailing slash character
  • file name with extension
  • file name without extension
  • file extension with a leading dot character
  • file extension without a leading dot character

tr converts the sed output into a separator character-delimited string of the above path components.

read uses the separator character as the field separator (IFS="$separatorChar") and assigns each of the path components to its respective variable.

Here's how the sed construct works:

  • s/^[[:space:]]+// and s/[[:space:]]+$// strip any leading and/or trailing whitespace characters
  • t l1 and :l1 refreshes the t function for the next s function
  • s/^([^/]|$)// and t tests for an invalid input path (one that does not begin with a forward slash), in which case it leaves all output lines blank and quits the sed command
  • s/[/]+$// strips any trailing slashes
  • t l2 and :l2 refreshes the t function for the next s function
  • s/^$/filesystem\/\\[newline]filesystem/p and t tests for the special case where the input path consists of the root directory /, in which case it prints filesystem/ and filesystem for the dirPathWithSlash and dirPath output lines, leaves all other output lines blank, and quits the sed command
  • h saves the input path in the hold space
  • s/^(.*)([/])([^/]+)$/\1\2\\[newline]\1\\[newline]\3/p prints the dirPathWithSlash, dirPath, and fileNameWithExt output lines
  • g retrieves the input path from the hold space
  • t l3 and :l3 refreshes the t function for the next s function
  • s/^.*[/]([^/]+)([.])([a-zA-Z0-9]+)$/\1\\[newline]\2\3\\[newline]\3/p and t prints the fileName, fileExtWithDot, and fileExt output lines for the case where a file extension exists (assumed to consist of alphanumeric characters only), then quits the sed command
  • s/^.*[/](.+)$/\1/p prints the fileName but not the fileExtWithDot and fileExt output lines for the case where a file extension does not exist, then quits the sed command.

@Alex Gray 2013-08-27 02:18:45

Based largely off of @mklement0's excellent answer, chock-full of random, useful bashisms, as well as other answers to this and other questions (and "that darn internet"), I wrapped it all up in a little, slightly more comprehensible, reusable function for my (or your) .bash_profile that takes care of what (I consider) should be a more robust version of dirname/basename and the like.

function path { SAVEIFS=$IFS; IFS=""   # stash IFS for safe-keeping, etc.
    [[ $# != 2 ]] && echo "usage: path <path> <dir|name|fullname|ext>" && return    # demand 2 arguments
    [[ $1 =~ ^(.*/)?(.+)?$ ]] && {     # regex parse the path
        dir=${BASH_REMATCH[1]}
        file=${BASH_REMATCH[2]}
        ext=$([[ $file = *.* ]] && printf %s ${file##*.} || printf '')
        # edge cases for extensionless files and files like ".nesh_profile.coffee"
        [[ $file == $ext ]] && fnr=$file && ext='' || fnr=${file:0:$((${#file}-${#ext}))}
        case "$2" in
             dir) echo      "${dir%/*}"; ;;
            name) echo      "${fnr%.*}"; ;;
        fullname) echo "${fnr%.*}.$ext"; ;;
             ext) echo           "$ext"; ;;
        esac
    }
    IFS=$SAVEIFS
}     

Usage examples...

SOMEPATH=/path/to.some/.random\ file.gzip
path "$SOMEPATH" dir        # /path/to.some
path "$SOMEPATH" name       # .random file
path "$SOMEPATH" ext        # gzip
path "$SOMEPATH" fullname   # .random file.gzip
path gobbledygook           # usage: path <path> <dir|name|fullname|ext>

@mklement0 2013-11-26 14:18:42

Nicely done; a few suggestions: - You don't seem to be relying on $IFS at all (and if you were, you could use local to localize the effect of setting it). - Better to use local variables. - Your error message should be output to stderr, not stdout (use 1>&2), and you should return a non-zero exit code. - Better to rename fullname to basename (the former suggests a path with dir components). - name unconditionally appends a . (period), even if the original had none. You could simply use the basename utility, but note that it ignores a terminating /.
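A rough sketch of what a version following these suggestions might look like. The name pathpart is made up, and this is not the original author's code: it uses local variables, writes errors to stderr, returns a non-zero status on misuse, and renames fullname to basename. Input paths with a trailing slash are not handled here.

```bash
# Hypothetical rework; "pathpart" is a made-up name, not from the answer above.
pathpart() {
    local dir file root ext=
    if [ "$#" -ne 2 ]; then
        echo "usage: pathpart <path> <dir|name|basename|ext>" >&2   # errors go to stderr
        return 2                                                     # non-zero exit status
    fi
    file=${1##*/}                                     # last path component
    case $1 in
        */*) dir=${1%/*}; [ -n "$dir" ] || dir=/ ;;   # "/foo" has parent "/"
        *)   dir=. ;;                                 # bare file name
    esac
    root=$file
    case $file in
        ?*.*) root=${file%.*}; ext=${file##*.} ;;     # only a dot after the first character counts
    esac
    case $2 in
        dir)      printf '%s\n' "$dir" ;;
        name)     printf '%s\n' "$root" ;;
        basename) printf '%s\n' "$file" ;;
        ext)      printf '%s\n' "$ext" ;;
        *)        echo "pathpart: unknown component: $2" >&2; return 2 ;;
    esac
}

pathpart '/path/to.some/.random file.gzip' name   # .random file
pathpart '/path/to.some/.random file.gzip' ext    # gzip
```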

@mklement0 2013-08-09 03:45:30

The accepted answer works well in typical cases, but fails in edge cases, namely:

  • For filenames without extension (called suffix in the remainder of this answer), extension=${filename##*.} returns the input filename rather than an empty string.
  • extension=${filename##*.} does not include the initial ., contrary to convention.
    • Blindly prepending . would not work for filenames without suffix.
  • filename="${filename%.*}" will be the empty string, if the input file name starts with . and contains no further . characters (e.g., .bash_profile) - contrary to convention.

---------

Thus, the complexity of a robust solution that covers all edge cases calls for a function - see its definition below; it can return all components of a path.

Example call:

splitPath '/etc/bash.bashrc' dir fname fnameroot suffix
# -> $dir == '/etc'
# -> $fname == 'bash.bashrc'
# -> $fnameroot == 'bash'
# -> $suffix == '.bashrc'

Note that the arguments after the input path are freely chosen, positional variable names.
To skip variables not of interest that come before those that are, specify _ (to use throw-away variable $_) or ''; e.g., to extract filename root and extension only, use splitPath '/etc/bash.bashrc' _ _ fnameroot extension.


# SYNOPSIS
#   splitPath path varDirname [varBasename [varBasenameRoot [varSuffix]]] 
# DESCRIPTION
#   Splits the specified input path into its components and returns them by assigning
#   them to variables with the specified *names*.
#   Specify '' or throw-away variable _ to skip earlier variables, if necessary.
#   The filename suffix, if any, always starts with '.' - only the *last*
#   '.'-prefixed token is reported as the suffix.
#   As with `dirname`, varDirname will report '.' (current dir) for input paths
#   that are mere filenames, and '/' for the root dir.
#   As with `dirname` and `basename`, a trailing '/' in the input path is ignored.
#   A '.' as the very first char. of a filename is NOT considered the beginning
#   of a filename suffix.
# EXAMPLE
#   splitPath '/home/jdoe/readme.txt' parentpath fname fnameroot suffix
#   echo "$parentpath" # -> '/home/jdoe'
#   echo "$fname" # -> 'readme.txt'
#   echo "$fnameroot" # -> 'readme'
#   echo "$suffix" # -> '.txt'
#   ---
#   splitPath '/home/jdoe/readme.txt' _ _ fnameroot
#   echo "$fnameroot" # -> 'readme'  
splitPath() {
  local _sp_dirname= _sp_basename= _sp_basename_root= _sp_suffix=
    # simple argument validation
  (( $# >= 2 )) || { echo "$FUNCNAME: ERROR: Specify an input path and at least 1 output variable name." >&2; return 2; }
    # extract dirname (parent path) and basename (filename)
  _sp_dirname=$(dirname "$1")
  _sp_basename=$(basename "$1")
    # determine suffix, if any
  _sp_suffix=$([[ $_sp_basename = *.* ]] && printf %s ".${_sp_basename##*.}" || printf '')
    # determine basename root (filename w/o suffix)
  if [[ "$_sp_basename" == "$_sp_suffix" ]]; then # does filename start with '.'?
      _sp_basename_root=$_sp_basename
      _sp_suffix=''
  else # strip suffix from filename
    _sp_basename_root=${_sp_basename%$_sp_suffix}
  fi
  # assign to output vars.
  [[ -n $2 ]] && printf -v "$2" '%s' "$_sp_dirname"
  [[ -n $3 ]] && printf -v "$3" '%s' "$_sp_basename"
  [[ -n $4 ]] && printf -v "$4" '%s' "$_sp_basename_root"
  [[ -n $5 ]] && printf -v "$5" '%s' "$_sp_suffix"
  return 0
}

Test code that exercises the function:

test_paths=(
  '/etc/bash.bashrc'
  '/usr/bin/grep'
  '/Users/jdoe/.bash_profile'
  '/Library/Application Support/'
  'readme.new.txt'
)

for p in "${test_paths[@]}"; do
  echo ----- "$p"
  parentpath= fname= fnameroot= suffix=
  splitPath "$p" parentpath fname fnameroot suffix
  for n in parentpath fname fnameroot suffix; do
    echo "$n=${!n}"
  done
done

Expected output - note the edge cases:

  • a filename having no suffix
  • a filename starting with . (not considered the start of the suffix)
  • an input path ending in / (trailing / is ignored)
  • an input path that is a filename only (. is returned as the parent path)
  • a filename that has more than one .-prefixed token (only the last is considered the suffix)

----- /etc/bash.bashrc
parentpath=/etc
fname=bash.bashrc
fnameroot=bash
suffix=.bashrc
----- /usr/bin/grep
parentpath=/usr/bin
fname=grep
fnameroot=grep
suffix=
----- /Users/jdoe/.bash_profile
parentpath=/Users/jdoe
fname=.bash_profile
fnameroot=.bash_profile
suffix=
----- /Library/Application Support/
parentpath=/Library
fname=Application Support
fnameroot=Application Support
suffix=
----- readme.new.txt
parentpath=.
fname=readme.new.txt
fnameroot=readme.new
suffix=.txt

@Bruno BEAUFILS 2018-12-11 02:16:21

IMHO the best solution has already been given (using shell parameter expansion) and is the best-rated one at this time.

However, I add this one, which uses only dumb commands, is not efficient, and which no one serious should ever use:

FILENAME=$(echo $FILE | cut -d . -f 1-$(printf $FILE | tr . '\n' | wc -l))
EXTENSION=$(echo $FILE | tr . '\n' | tail -1)

Added just for fun :-)

@Ron 2017-04-22 14:39:16

Smallest and simplest solution (in single line) is:

$ file=/blaabla/bla/blah/foo.txt
echo $(basename ${file%.*}) # foo

@tripleee 2017-11-10 10:34:17

That's a useless use of echo. In general, echo $(command) is better written simply command unless you specifically require the shell to perform whitespace tokenization and wildcard expansion on the output from command before displaying the result. Quiz: what's the output of echo $(echo '*') (and if that's what you really want, you really really want just echo *).
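To make the quiz concrete, here is a small sketch run in a scratch directory. The directory, the two file names and the variable names are assumptions for illustration, and mktemp is assumed to be available.

```bash
demo_dir=$(mktemp -d)          # scratch directory so the glob has known matches
cd "$demo_dir" || exit 1
touch one.txt two.txt

quoted=$(echo "$(echo '*')")   # quotes keep the asterisk literal
unquoted=$(echo $(echo '*'))   # the unquoted result is glob-expanded by the shell

echo "$quoted"                 # *
echo "$unquoted"               # one.txt two.txt

cd - >/dev/null && rm -rf "$demo_dir"
```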

@Ron 2018-04-17 10:07:42

@triplee I didn't use echo command at all. I just used it to demonstrate the result foo which is appearing in the 3rd line as the result of the 2nd line.

@tripleee 2018-04-17 10:35:30

But just basename "${file%.*}" would do the same; you are using a command substitution to capture its output, only to echo that same output immediately. (Without quoting, the result is nominally different; but that's hardly relevant, much less a feature, here.)

@tripleee 2018-04-17 11:21:02

Also basename "$file" .txt avoids the complexity of the parameter substitution.

@Ron 2018-04-18 17:17:03

Just read my previous reply carefully before talking about unrelated stuff and wasting our time. Thanks

@frederick99 2018-11-16 17:27:38

@Ron Read his first comment before accusing him of wasting our time.

@Ron 2018-11-17 19:03:04

@frederick99 I did. And I replied! He's talking blah blah about echo, which is not even a part of the command. All you need to extract from my code is this part: $(basename ${file%.*}). Don't even look at echo. It has no purpose, but it's still in the example just to help people get a clear idea of what's happening under the hood. echo is completely unrelated to this topic, "extract filename and extension in bash".

@frederick99 2018-11-18 03:28:22

@Ron I know it's not related to the question but he had a point, even if it was not relevant. I felt what he said was interesting and definitely not a waste of time (unlike this :l).

@Petesh 2009-06-08 14:05:19

First, get file name without the path:

filename=$(basename -- "$fullfile")
extension="${filename##*.}"
filename="${filename%.*}"

Alternatively, you can focus on the last '/' of the path instead of the '.' which should work even if you have unpredictable file extensions:

filename="${fullfile##*/}"

You may want to check the documentation :

@D.Shawley 2009-06-08 14:08:51

Check out gnu.org/software/bash/manual/html_node/… for the full feature set.

@lhunath 2009-06-08 14:34:48

Add some quotes to "$fullfile", or you'll risk breaking the filename.

@ephemient 2009-06-09 17:52:24

Heck, you could even write filename="${fullfile##*/}" and avoid calling an extra basename

@nccc 2012-07-01 03:42:00

This "solution" does not work if the file does not have an extension -- instead, the whole file name is output, which is quite bad considering that files without extensions are omnipresent.

@mklement0 2012-09-07 14:41:40

Fix for dealing with file names without extension: extension=$([[ "$filename" = *.* ]] && echo ".${filename##*.}" || echo ''). Note that if an extension is present, it will be returned including the initial ., e.g., .txt.

@mklement0 2012-09-07 15:08:56

Note the following about this solution: - For file names without extension, $extension returns the file name rather than "". - $extension does not include the initial ".". - $filename will be "", if the input file name starts with "." and contains no further "." characters (e.g., .bash_profile).
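The three points can be seen with a small sketch that applies the same expansions to three sample paths. The helper name show_split and the paths are made up for illustration.

```bash
# Hypothetical helper: applies the expansions from the answer and prints the results.
show_split() {
    local filename=${1##*/}
    printf 'ext=[%s] stem=[%s]\n' "${filename##*.}" "${filename%.*}"
}

show_split /tmp/archive.tar.gz      # ext=[gz] stem=[archive.tar]
show_split /usr/bin/grep            # ext=[grep] stem=[grep]
show_split /home/me/.bash_profile   # ext=[bash_profile] stem=[]
```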

@Keith Thompson 2013-08-09 00:09:18

I've sometimes wished that basename had an option to remove an arbitrary extension. basename foo.txt .txt prints foo, but it can't strip the .txt unless you tell it what the suffix is. (Yes, it's easy enough to do with other facilities, but "${filename%.*}" isn't exactly easy to remember.)

@v.oddou 2013-11-26 07:27:09

@mklement0 : .bash_profile is a filename without a base in my sense. it is composed of just an extension. maybe that sounds weird, but Special cases aren't special enough to break the rules.

@mklement0 2013-11-26 14:23:15

@v.oddou: Fair enough, it certainly simplifies the solution greatly (cf. the madness of my answer). However, it may not be what people expect (I didn't), so I thought it worth pointing out - initial-period-to-mark-hidden-files and period-as-extension-separator are competing concepts.

@v.oddou 2013-11-27 08:25:37

@mklement0 : Yes, and I'd like a definite definition as well. Everybody should agree on this one; I think it's important too. I just happen to vote for initial periods being interpreted as extension separators, because it removes a special case. But that's just my personal vote. I think we should check how boost::filesystem, Python and MS Win32 SplitPath... parse this kind of file to get our answer. And if they all agree, accept that as convention.

@mklement0 2013-11-27 20:42:01

@v.oddou: Results: Python, Ruby, node.js, and Perl make the exception (treat . at the beginning special); by contrast, the Boost C++ library does not (and neither does .NET, but I don't think that matters, given that it was designed for Windows).

@tomsgd 2018-06-13 02:27:07

What is the reason for the -- in filename=$(basename -- "$fullfile") From my local tests, it seems to work just as well without it but am I missing something?
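One likely reason, sketched below: -- marks the end of options, so a file name that starts with a dash is treated as an operand rather than being parsed as options. The file name here is made up; without the --, basename may reject such a name or interpret it as flags, depending on the implementation.

```bash
fullfile="-n.txt"                      # hypothetical name starting with a dash
filename=$(basename -- "$fullfile")    # "--" ends option parsing
echo "$filename"                       # -n.txt
```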

@Evidlo 2019-01-23 03:28:55

Alternatively, one could add function rootname(){ echo "${1%.*}"; } to their .bashrc

@kenn 2020-03-24 15:10:46

It won't work for files which have multiple dots in their names. This one fails echo ${cmake-3.16.4.tar.gz##*.}

@msoutopico 2020-09-03 18:11:25

This solution worked for me but you have to be careful if a) the name contains other dots and b) if $fullfile has a pattern (which will not expand because of the quotes). Removing the quotes seems to make it work fine in those cases: touch okapiFiltersForOmegaT-1.8-1.40.0.jar && jarfile="okapiFiltersForOmegaT*.jar" && filename=$(basename -- $jarfile) && extension="${filename##*.}" && filename="${filename%.*}". That gives me the expected okapiFiltersForOmegaT-1.8-1.40.0.

@enyo 2010-07-12 12:44:04

Simply use ${parameter%word}

In your case:

${FILE%.*}

If you want to test it, all following work, and just remove the extension:

FILE=abc.xyz; echo ${FILE%.*};
FILE=123.abc.xyz; echo ${FILE%.*};
FILE=abc; echo ${FILE%.*};

@SilverWolf - Reinstate Monica 2018-05-14 15:24:26

Why the downvote? It's still useful, although there shouldn't be spaces around the = signs.

@Alex. S. 2018-12-19 03:11:26

This works fine. Thank you! (now it doesn't have the spaces around the equal signs, if that was why it was downvoted)

@phil294 2017-09-16 04:41:23

If you also want to allow empty extensions, this is the shortest I could come up with:

echo 'hello.txt' | sed -r 's/.+\.(.+)|.*/\1/' # EXTENSION
echo 'hello.txt' | sed -r 's/(.+)\..+|(.*)/\1\2/' # FILENAME

1st line explained: It matches PATH.EXT or ANYTHING and replaces it with EXT. If ANYTHING was matched, the ext group is not captured.

@Cyker 2016-12-02 09:04:43

No need to bother with awk or sed or even perl for this simple task. There is a pure-Bash, os.path.splitext()-compatible solution which only uses parameter expansions.

Reference Implementation

Documentation of os.path.splitext(path):

Split the pathname path into a pair (root, ext) such that root + ext == path, and ext is empty or begins with a period and contains at most one period. Leading periods on the basename are ignored; splitext('.cshrc') returns ('.cshrc', '').

Python code:

root, ext = os.path.splitext(path)

Bash Implementation

Honoring leading periods

root="${path%.*}"
ext="${path#"$root"}"

Ignoring leading periods

root="${path#.}";root="${path%"$root"}${root%.*}"
ext="${path#"$root"}"

Tests

Here are test cases for the Ignoring leading periods implementation, which should match the Python reference implementation on every input.

|---------------|-----------|-------|
|path           |root       |ext    |
|---------------|-----------|-------|
|' .txt'        |' '        |'.txt' |
|' .txt.txt'    |' .txt'    |'.txt' |
|' txt'         |' txt'     |''     |
|'*.txt.txt'    |'*.txt'    |'.txt' |
|'.cshrc'       |'.cshrc'   |''     |
|'.txt'         |'.txt'     |''     |
|'?.txt.txt'    |'?.txt'    |'.txt' |
|'\n.txt.txt'   |'\n.txt'   |'.txt' |
|'\t.txt.txt'   |'\t.txt'   |'.txt' |
|'a b.txt.txt'  |'a b.txt'  |'.txt' |
|'a*b.txt.txt'  |'a*b.txt'  |'.txt' |
|'a?b.txt.txt'  |'a?b.txt'  |'.txt' |
|'a\nb.txt.txt' |'a\nb.txt' |'.txt' |
|'a\tb.txt.txt' |'a\tb.txt' |'.txt' |
|'txt'          |'txt'      |''     |
|'txt.pdf'      |'txt'      |'.pdf' |
|'txt.tar.gz'   |'txt.tar'  |'.gz'  |
|'txt.txt'      |'txt'      |'.txt' |
|---------------|-----------|-------|

Test Results

All tests passed.
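For anyone who wants to re-run a few of these rows, here is a minimal sketch of a test harness. The function names splitext and check are assumptions, not part of the answer; the two expansion lines inside splitext are the "ignoring leading periods" lines from above. It is only meant for bare file names like those in the table, not for paths with directory components.

```bash
# Hypothetical wrapper: sets the global variables root and ext.
splitext() {
    local path=$1
    root="${path#.}"; root="${path%"$root"}${root%.*}"
    ext="${path#"$root"}"
}

# check <path> <expected root> <expected ext>
check() {
    splitext "$1"
    if [ "$root" = "$2" ] && [ "$ext" = "$3" ]; then
        echo "ok   '$1'"
    else
        echo "FAIL '$1' -> root='$root' ext='$ext'"
        return 1
    fi
}

check '.cshrc'      '.cshrc'   ''
check ' .txt'       ' '        '.txt'
check '*.txt.txt'   '*.txt'    '.txt'
check 'a b.txt.txt' 'a b.txt'  '.txt'
check 'txt.tar.gz'  'txt.tar'  '.gz'
check 'txt'         'txt'      ''
```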

@frederick99 2018-11-16 17:22:15

no, the base file name for text.tar.gz should be text and extension be .tar.gz

@Cyker 2018-12-25 19:11:37

@frederick99 As I said the solution here matches the implementation of os.path.splitext in Python. Whether that implementation is sane for possibly controversial inputs is another topic.

@ymett 2019-06-12 08:14:18

How do the quotes within the pattern ("$root") work? What could happen if they were omitted? (I couldn't find any documentation on the matter.) Also how does this handle filenames with * or ? in them?

@ymett 2019-06-12 08:25:16

Ok, testing shows me that the quotes make the pattern a literal, i.e. * and ? aren't special. So the two parts of my question answer each other. Am I correct that this isn't documented? Or is this supposed to be understood from the fact that quotes disable glob expansion in general?
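A small sketch of the difference, consistent with the general rule that quoting a pattern character makes it literal. The variable names and the sample value are chosen for illustration.

```bash
path='a*b.txt'
pat='a*'                     # contains a glob character

echo "${path#$pat}"          # unquoted: a* is a pattern, shortest match "a" is removed -> *b.txt
echo "${path#"$pat"}"        # quoted: the literal string "a*" is removed -> b.txt
```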

@Maëlan 2020-02-09 15:23:20

Brilliant answer! I’ll just suggest a slightly simpler variant for computing the root: root="${path#?}";root="${path::1}${root%.*}" — then proceed the same to extract the extension.

@Al3xXx 2011-12-09 19:27:02

Ok so if I understand correctly, the problem here is how to get the name and the full extension of a file that has multiple extensions, e.g., stuff.tar.gz.

This works for me:

fullfile="stuff.tar.gz"
fileExt=${fullfile#*.}
fileName=${fullfile%*.$fileExt}

This will give you stuff as filename and tar.gz as extension. It works for any number of extensions, including 0. Hope this helps for anyone having the same problem =)

@Cyker 2016-12-02 09:15:50

The correct result (according to os.path.splitext, which is what the OP wants) is ('stuff.tar', '.gz').

@cvr 2015-12-04 16:26:45

Building on Petesh's answer, if only the filename is needed, both path and extension can be stripped in a single line,

filename=$(basename ${fullname%.*})

@helmy 2016-01-20 17:29:32

Did not work for me: "basename: missing operand Try 'basename --help' for more information."

@cvr 2016-02-01 23:14:41

Strange, are you certain you're using Bash? In my case, with both versions 3.2.25 (old CentOS) and 4.3.30 (Debian Jessie) it works flawlessly.

@Adrian 2017-03-14 20:08:11

Maybe there is a space in the filename? Try using filename="$(basename "${fullname%.*}")"

@tripleee 2017-11-10 10:38:23

The second argument to basename is optional, but specifies the extension to strip off. The substitution might still be useful but perhaps basename actually isn't, since you can actually perform all of these substitutions with shell builtins.
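As a sketch of the builtin-only route mentioned here, with a made-up path that contains a space (which also sidesteps the quoting problem reported above):

```bash
fullname="/some dir/report.final.pdf"   # hypothetical path with a space

filename=${fullname##*/}    # strip the directory: report.final.pdf
filename=${filename%.*}     # strip the last extension: report.final

echo "$filename"            # report.final
```

When the extension is known in advance, basename "$fullname" .pdf gives the same result for this input.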

@maciek gajewski 2012-09-10 13:46:00

You can make cut display a field and all subsequent ones by appending - to the field number.

NAME=`basename "$FILE"`
EXTENSION=`echo "$NAME" | cut -d'.' -f2-`

So if FILE is eth0.pcap.gz, the EXTENSION will be pcap.gz

Using the same logic, you can also fetch the file name using '-' with cut as follows :

NAME=`basename "$FILE" | cut -d'.' -f-1`

This works even for filenames that do not have any extension.

@henfiber 2015-06-16 09:02:52

Here are some alternative suggestions (mostly in awk), including some advanced use cases, like extracting version numbers for software packages.

f='/path/to/complex/file.1.0.1.tar.gz'

# Filename : 'file.1.0.1.tar.gz'
    echo "$f" | awk -F'/' '{print $NF}'

# Extension (last): 'gz'
    echo "$f" | awk -F'[.]' '{print $NF}'

# Extension (all) : '1.0.1.tar.gz'
    echo "$f" | awk '{sub(/[^.]*[.]/, "", $0)} 1'

# Extension (last-2): 'tar.gz'
    echo "$f" | awk -F'[.]' '{print $(NF-1)"."$NF}'

# Basename : 'file'
    echo "$f" | awk '{gsub(/.*[/]|[.].*/, "", $0)} 1'

# Basename-extended : 'file.1.0.1.tar'
    echo "$f" | awk '{gsub(/.*[/]|[.]{1}[^.]+$/, "", $0)} 1'

# Path : '/path/to/complex/'
    echo "$f" | awk '{match($0, /.*[/]/, a); print a[0]}'
    # or 
    echo "$f" | grep -Eo '.*[/]'

# Folder (containing the file) : 'complex'
    echo "$f" | awk -F'/' '{$1=""; print $(NF-1)}'

# Version : '1.0.1'
    # Defined as 'number.number' or 'number.number.number'
    echo "$f" | grep -Eo '[0-9]+[.]+[0-9]+[.]?[0-9]?'

    # Version - major : '1'
    echo "$f" | grep -Eo '[0-9]+[.]+[0-9]+[.]?[0-9]?' | cut -d. -f1

    # Version - minor : '0'
    echo "$f" | grep -Eo '[0-9]+[.]+[0-9]+[.]?[0-9]?' | cut -d. -f2

    # Version - patch : '1'
    echo "$f" | grep -Eo '[0-9]+[.]+[0-9]+[.]?[0-9]?' | cut -d. -f3

# All Components : "path to complex file 1 0 1 tar gz"
    echo "$f" | awk -F'[/.]' '{$1=""; print $0}'

# Is absolute : True (exit-code : 0)
    # Return true if it is an absolute path (starting with '/' or '~/')
    echo "$f" | grep -q '^[/]\|^~/'

All use cases are using the original full path as input, without depending on intermediate results.

@RandyP 2014-06-09 21:18:42

A simple answer:

To expand on the POSIX variables answer, note that you can do more interesting patterns. So for the case detailed here, you could simply do this:

tar -zxvf $1
cd ${1%.tar.*}

That will cut off the last occurrence of .tar.<something>.

More generally, if you wanted to remove the last occurrence of .<something>.<something-else> then

${1%.*.*}

should work fine.
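A quick sketch of both patterns applied to a made-up archive name held in a variable instead of $1:

```bash
archive="project-1.2.tar.gz"     # hypothetical archive name

echo "${archive%.tar.*}"         # project-1.2
echo "${archive%.*.*}"           # project-1.2
```

In the second form, % removes the shortest suffix that contains two dots, which here is .tar.gz.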

The link in the above answer appears to be dead. Here's a great explanation of a bunch of the string manipulation you can do directly in Bash, from TLDP.

@tonix 2015-01-02 09:42:01

Is there a way to make the match case-insensitive?
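One way that needs no shell options is to spell out both cases with bracket expressions in the pattern. A minimal sketch; strip_tar_suffix is a hypothetical helper name, not something from the answer above.

```bash
#!/bin/bash
# Hypothetical helper: remove a trailing .tar.<something>, ignoring the case of "tar"
strip_tar_suffix() {
    printf '%s\n' "${1%.[Tt][Aa][Rr].*}"
}

strip_tar_suffix 'Backup.TAR.GZ'   # Backup
strip_tar_suffix 'backup.tar.gz'   # backup
```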

@commonpike 2014-01-03 11:14:33

From the answers above, the shortest oneliner to mimic Python's

file, ext = os.path.splitext(path)

presuming your file really does have an extension, is

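EXT="${FILEPATH##*.}"; FILE=$(basename "$FILEPATH" ".$EXT")

(Do not name the variable PATH: assigning to it overwrites the shell's command search path, and basename can then no longer be found.)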
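If the file might not have an extension, a slightly longer helper gets closer to what os.path.splitext returns. This is a sketch under assumptions, not a faithful port: splitext, split_root and split_ext are hypothetical names, and unusual inputs such as names made only of dots are not handled the way Python handles them.

```bash
#!/bin/bash
# Hypothetical helper: approximate Python's os.path.splitext for one path.
# Results are stored in the global variables split_root and split_ext.
splitext() {
    local path=$1 base
    base=${path##*/}          # only the last path component can carry an extension
    split_root=$path
    split_ext=
    case $base in
        ?*.*)                 # a dot that is not the first character of the name
            split_ext=.${base##*.}
            split_root=${path%"$split_ext"}
            ;;
    esac
}

splitext 'dir.d/a.b.js'; echo "$split_root | $split_ext"   # dir.d/a.b | .js
splitext '.bashrc';      echo "$split_root | $split_ext"   # .bashrc |
splitext 'README';       echo "$split_root | $split_ext"   # README |
```

As in Python, the extension keeps its leading dot, so concatenating the two parts gives back the original path.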

@commonpike 2014-12-11 11:26:11

I've got downvotes on this. I'm considering removing the answer; people somehow dislike it.

@David Cullen 2016-02-09 17:19:03

basename does not remove the extension, just the path.

@David Cullen 2016-02-10 00:13:38

It has been so long since I looked at the man page I forgot about the SUFFIX option.

@tripleee 2017-11-10 10:39:38

You have to know which extension you want to strip off before you know what to put in EXT so this is turtles all the way down. (Also, you should avoid all uppercase for your private variable names; they are reserved for system variables.)

@F. Hauri 2013-07-07 15:47:25

Magic file recognition

In addition to the lot of good answers on this Stack Overflow question I would like to add:

Under Linux and other Unix-like systems, there is a magic command named file that detects the file type by analysing the first bytes of the file. This is a very old tool, initially used for print servers (if not created for them... I'm not sure about that).

file myfile.txt
myfile.txt: UTF-8 Unicode text

file -b --mime-type myfile.txt
text/plain

Standard extensions can be found in /etc/mime.types (on my Debian GNU/Linux desktop; see man file and man mime.types. You may have to install the file utility and mime-support packages):

grep $( file -b --mime-type myfile.txt ) </etc/mime.types
text/plain      asc txt text pot brf srt

You could create a function for determining the right extension. Here is a small (imperfect) sample:

file2ext() {
    local _mimetype=$(file -Lb --mime-type "$1") _line _basemimetype
    case ${_mimetype##*[/.-]} in
        gzip | bzip2 | xz | z )
            _mimetype=${_mimetype##*[/.-]}
            _mimetype=${_mimetype//ip}
            _basemimetype=$(file -zLb --mime-type "$1")
            ;;
        stream )
            _mimetype=($(file -Lb "$1"))
            [ "${_mimetype[1]}" = "compressed" ] &&
                _basemimetype=$(file -b --mime-type - < <(
                        ${_mimetype,,} -d <"$1")) ||
                _basemimetype=${_mimetype,,}
            _mimetype=${_mimetype,,}
            ;;
        executable )  _mimetype='' _basemimetype='' ;;
        dosexec )     _mimetype='' _basemimetype='exe' ;;
        shellscript ) _mimetype='' _basemimetype='sh' ;;
        * )
            _basemimetype=$_mimetype
            _mimetype=''
            ;;
    esac
    while read -a _line ;do
        if [ "$_line" == "$_basemimetype" ] ;then
            [ "${_line[1]}" ] &&
                _basemimetype=${_line[1]} ||
                _basemimetype=${_basemimetype##*[/.-]}
            break
        fi
        done </etc/mime.types
    case ${_basemimetype##*[/.-]} in
        executable ) _basemimetype='' ;;
        shellscript ) _basemimetype='sh' ;;
        dosexec ) _basemimetype='exe' ;;
        * ) ;;
    esac
    [ "$_mimetype" ] && [ "$_basemimetype" != "$_mimetype" ] &&
      printf ${2+-v} $2 "%s.%s" ${_basemimetype##*[/.-]} ${_mimetype##*[/.-]} ||
      printf ${2+-v} $2 "%s" ${_basemimetype##*[/.-]}
}

This function could set a Bash variable that can be used later:

(This is inspired by @Petesh's correct answer):

filename=$(basename "$fullfile")
filename="${filename%.*}"
file2ext "$fullfile" extension

echo "$fullfile -> $filename . $extension"

@Édouard Lopez 2013-02-05 08:51:39

Maybe there is an option in tar to do this; did you check the man page? Otherwise, you can use Bash string expansion:

test="mpc-1.0.1.tar.gz"
noExt="${test/.tar.gz/}" # Remove the string '.tar.gz'
echo $noExt
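Note that the ${var/pattern/} form removes the first match wherever it occurs, while ${var%pattern} only removes a match at the end of the value. The difference shows up with a name that contains the suffix twice; the name below is made up for illustration.

```bash
#!/bin/bash
name='data.tar.gz.old.tar.gz'      # hypothetical name containing the suffix twice

first_removed=${name/.tar.gz/}     # removes the first match, wherever it is
suffix_removed=${name%.tar.gz}     # removes the match only at the end

echo "$first_removed"              # data.old.tar.gz
echo "$suffix_removed"             # data.tar.gz.old
```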

@Brent 2014-02-14 21:28:45

cd $(tar tf $1 | sed -n 1p)

@smilyface 2011-11-21 10:35:44

Here is code with AWK. It can probably be done more simply, but I am not good at AWK.

filename$ ls
abc.a.txt  a.b.c.txt  pp-kk.txt
filename$ find . -type f | awk -F/ '{print $2}' | rev | awk -F"." '{$1="";print}' | rev | awk 'gsub(" ",".") ,sub(".$", "")'
abc.a
a.b.c
pp-kk
filename$ find . -type f | awk -F/ '{print $2}' | awk -F"." '{print $NF}'
txt
txt
txt

@BHSPitMonkey 2013-04-05 21:13:38

You shouldn't need the first awk statement in the last example, right?

@tripleee 2017-11-10 10:42:01

You can avoid piping Awk to Awk by doing another split(). awk -F / '{ n=split($2, a, "."); print a[n] }' uses / as the top-level delimiter but then splits the second field on . and prints the last element from the new array.
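The split() idea can be wrapped in a small function. This is a sketch with one change that is not in the comment: it splits the last path component ($NF) rather than the second field, and prints an empty line when there is no dot. last_ext is a hypothetical name.

```bash
#!/bin/bash
# Hypothetical helper: print the last extension of each path read from stdin
last_ext() {
    awk -F / '{
        n = split($NF, parts, ".")           # split the file name on dots
        print (n > 1 ? parts[n] : "")        # last piece, or nothing if no dot
    }'
}

printf '%s\n' ./abc.a.txt ./a.b.c.txt ./pp-kk.txt | last_ext
```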

@Andrew Woolfgang 2011-09-29 07:26:47

I think that if you just need the name of the file, you can try this:

FULLPATH=/usr/share/X11/xorg.conf.d/50-synaptics.conf

# Remove everything up to and including the last "/" character
FILENAME=${FULLPATH##*/}

# Remove everything up to and including the last "." character
FILEEXTENSION=${FILENAME##*.}

# Remove a suffix, in our case, the filename. This will return the name of the directory that contains this file.
BASEDIRECTORY=${FULLPATH%$FILENAME}

echo "path = $FULLPATH"
echo "file name = $FILENAME"
echo "file extension = $FILEEXTENSION"
echo "base directory = $BASEDIRECTORY"

And that is all =D.
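One caveat with the last expansion: an unquoted variable inside ${...%...} is treated as a pattern, so a file name containing glob characters may not be stripped. A small sketch with a made-up path, showing the problem and two alternatives:

```bash
#!/bin/bash
fullpath='/tmp/demo/report[1].txt'   # hypothetical path with glob characters

filename=${fullpath##*/}
unquoted=${fullpath%$filename}       # [1] is read as a bracket expression: nothing is removed
quoted=${fullpath%"$filename"}       # quoting makes the match literal
parent=${fullpath%/*}                # or cut at the last slash directly

echo "$unquoted"                     # /tmp/demo/report[1].txt
echo "$quoted"                       # /tmp/demo/
echo "$parent"                       # /tmp/demo
```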

@Carlos Ricardo 2012-12-09 20:42:56

Just wanted BASEDIRECTORY :) Thanks!

@Kebabbert 2010-07-21 10:24:37

Mellen writes in a comment on a blog post:

Using Bash, there’s also ${file%.*} to get the filename without the extension and ${file##*.} to get the extension alone. That is,

file="thisfile.txt"
echo "filename: ${file%.*}"
echo "extension: ${file##*.}"

Outputs:

filename: thisfile
extension: txt

@mklement0 2014-12-09 15:30:14

@chown 2012-05-19 18:59:03

Using example file /Users/Jonathan/Scripts/bash/MyScript.sh, this code:

MY_EXT=".${0##*.}"
ME=$(/usr/bin/basename "${0}" "${MY_EXT}")

will result in ${ME} being MyScript and ${MY_EXT} being .sh:


Script:

#!/bin/bash
set -e

MY_EXT=".${0##*.}"
ME=$(/usr/bin/basename "${0}" "${MY_EXT}")

echo "${ME} - ${MY_EXT}"

Some tests:

$ ./MyScript.sh 
MyScript - .sh

$ bash MyScript.sh
MyScript - .sh

$ /Users/Jonathan/Scripts/bash/MyScript.sh
MyScript - .sh

$ bash /Users/Jonathan/Scripts/bash/MyScript.sh
MyScript - .sh

@mklement0 2014-12-09 16:35:52

Not sure why this has so many downvotes - it's actually more efficient than the accepted answer. (Like the latter, it also breaks with input filenames without an extension.) Using an explicit path to basename is, perhaps, overkill.

@historystamp 2014-10-21 21:47:43

Here is the algorithm I used for finding the name and extension of a file when I wrote a Bash script to make names unique when names conflicted with respect to casing.

#! /bin/bash 

#
# Finds 
# -- name and extension pairs
# -- null extension when there isn't an extension.
# -- Finds name of a hidden file without an extension
# 

declare -a fileNames=(
  '.Montreal' 
  '.Rome.txt' 
  'Loundon.txt' 
  'Paris' 
  'San Diego.txt'
  'San Francisco' 
  )

echo "Script ${0} finding name and extension pairs."
echo 

for theFileName in "${fileNames[@]}"
do
     echo "theFileName=${theFileName}"  

     # Get the proposed name by chopping off the extension
     name="${theFileName%.*}"

     # get extension.  Set to null when there isn't an extension
     # Thanks to mklement0 in a comment above.
     extension=$([[ "$theFileName" == *.* ]] && echo ".${theFileName##*.}" || echo '')

     # a hidden file without extension?
     if [ "${theFileName}" = "${extension}" ] ; then
         # hidden file without extension.  Fixup.
         name=${theFileName}
         extension=""
     fi

     echo "  name=${name}"
     echo "  extension=${extension}"
done 

The test run.

$ config/Name\&Extension.bash 
Script config/Name&Extension.bash finding name and extension pairs.

theFileName=.Montreal
  name=.Montreal
  extension=
theFileName=.Rome.txt
  name=.Rome
  extension=.txt
theFileName=Loundon.txt
  name=Loundon
  extension=.txt
theFileName=Paris
  name=Paris
  extension=
theFileName=San Diego.txt
  name=San Diego
  extension=.txt
theFileName=San Francisco
  name=San Francisco
  extension=
$ 

FYI: The complete transliteration program and more test cases can be found here: https://www.dropbox.com/s/4c6m0f2e28a1vxf/avoid-clashes-code.zip?dl=0

@f0nzie 2020-05-25 19:38:22

Of all the solutions, this is the only one that returns an empty string when the file has no extension, with: extension=$([[ "$theFileName" == *.* ]] && echo ".${theFileName##*.}" || echo '')

@Srikant 2014-07-30 08:11:12

A simple Bash one-liner. I used this to remove the rst extension from the names of all files in the current directory:

for each in `ls -1 *.rst`
do
     a=$(echo $each | wc -c)
     echo $each | cut -c -$(( $a-5 )) >> blognames
done

What does it do?

1) ls -1 *.rst lists all the matching files on stdout, one per line (try it).

2) echo $each | wc -c counts the number of characters in each filename, including the trailing newline added by echo.

3) echo $each | cut -c -$(( $a-5 )) keeps everything except the last 4 characters, i.e., it drops .rst.
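The same result can be had without ls, wc, or cut by letting the shell expand the glob and strip the suffix itself, which also copes with spaces in file names. A minimal sketch; list_rst_stems is a hypothetical helper name, and it prints the names instead of appending them to blognames.

```bash
#!/bin/bash
# Hypothetical helper: print every *.rst name in a directory without its suffix
list_rst_stems() {
    local dir=$1 f
    for f in "$dir"/*.rst; do
        [ -e "$f" ] || continue      # skip the literal pattern when nothing matches
        f=${f##*/}                   # drop the directory part
        printf '%s\n' "${f%.rst}"    # drop the .rst suffix
    done
}

list_rst_stems .
```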

@Vytenis Bivainis 2014-05-22 20:24:16

You can use

sed 's/^/./' | rev | cut -d. -f2- | rev | cut -c2-

to get file name and

sed 's/^/./' | rev | cut -d. -f1  | rev

to get extension.

Test case:

echo "filename.gz"     | sed 's/^/./' | rev | cut -d. -f2- | rev | cut -c2-
echo "filename.gz"     | sed 's/^/./' | rev | cut -d. -f1  | rev
echo "filename"        | sed 's/^/./' | rev | cut -d. -f2- | rev | cut -c2-
echo "filename"        | sed 's/^/./' | rev | cut -d. -f1  | rev
echo "filename.tar.gz" | sed 's/^/./' | rev | cut -d. -f2- | rev | cut -c2-
echo "filename.tar.gz" | sed 's/^/./' | rev | cut -d. -f1  | rev

@Dennis 2014-03-30 21:41:03

How to extract the filename and extension in fish:

function split-filename-extension --description "Prints the filename and extension"
  for file in $argv
    if test -f $file
      set --local extension (echo $file | awk -F. '{print $NF}')
      set --local filename (basename $file .$extension)
      echo "$filename $extension"
    else
      echo "$file is not a valid file"
    end
  end
end

Caveats: Splits on the last dot, which works well for filenames with dots in them, but not well for extensions with dots in them. See example below.

Usage:

$ split-filename-extension foo-0.4.2.zip bar.tar.gz
foo-0.4.2 zip  # Looks good!
bar.tar gz  # Careful, you probably want .tar.gz as the extension.

There are probably better ways to do this. Feel free to edit my answer to improve it.


If there's a limited set of extensions you'll be dealing with and you know all of them, try this:

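switch $file
  # Quote the patterns so fish matches them against $file
  # instead of expanding them as file globs
  case '*.tar'
    echo (basename $file .tar) tar
  case '*.tar.bz2'
    echo (basename $file .tar.bz2) tar.bz2
  case '*.tar.gz'
    echo (basename $file .tar.gz) tar.gz
  # and so on
end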

This does not have the same caveat as the first example, but you do have to handle every case, so it could be more tedious depending on how many extensions you expect.
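For readers using Bash rather than fish, the same idea of listing the known compound extensions can be sketched with a case statement. The function and variable names (split_known_ext, stem, ext) are hypothetical, and the list of compound extensions is only an example to extend as needed.

```bash
#!/bin/bash
# Hypothetical helper: split off a known compound extension, else the last one.
# Results are stored in the global variables stem and ext.
split_known_ext() {
    local name=${1##*/}
    case $name in
        *.tar.gz|*.tar.bz2|*.tar.xz)
            ext=${name#"${name%.tar.*}".}   # keeps "tar.<something>"
            ;;
        ?*.*)
            ext=${name##*.}                 # fall back to the part after the last dot
            ;;
        *)
            ext=                            # no extension at all
            ;;
    esac
    if [ -n "$ext" ]; then
        stem=${name%".$ext"}
    else
        stem=$name
    fi
}

split_known_ext bar.tar.gz;    echo "$stem $ext"   # bar tar.gz
split_known_ext foo-0.4.2.zip; echo "$stem $ext"   # foo-0.4.2 zip
```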
