By code_cookies


2014-09-01 13:47:08 8 Comments

I am pulling a JSON file from a site and one of the strings received is:

The Weeknd &#8216;King Of The Fall&#8217; [Video Premiere] | @TheWeeknd | #SoPhi

How can I convert things like &#8216; into the correct characters?

I've made an Xcode Playground to demonstrate it:

import UIKit

var error: NSError?
let blogUrl: NSURL = NSURL.URLWithString("http://sophisticatedignorance.net/api/get_recent_summary/")
let jsonData = NSData(contentsOfURL: blogUrl)

let dataDictionary = NSJSONSerialization.JSONObjectWithData(jsonData, options: nil, error: &error) as NSDictionary

var a = dataDictionary["posts"] as NSArray

println(a[0]["title"])

22 comments

@akashivskyy 2014-09-01 14:03:01

There's no straightforward way to do that, but you can use NSAttributedString magic to make this process as painless as possible (be warned that this method will strip all HTML tags as well):

let encodedString = "The Weeknd <em>&#8216;King Of The Fall&#8217;</em>"

// encodedString should = a[0]["title"] in your case

// The guards below assume this runs inside a function returning String?
guard let data = encodedString.data(using: .utf8) else {
    return nil
}

let options: [NSAttributedString.DocumentReadingOptionKey: Any] = [
    .documentType: NSAttributedString.DocumentType.html,
    .characterEncoding: String.Encoding.utf8.rawValue
]

guard let attributedString = try? NSAttributedString(data: data, options: options, documentAttributes: nil) else {
    return nil
}

let decodedString = attributedString.string // The Weeknd ‘King Of The Fall’

Remember to initialize NSAttributedString on the main thread only. The HTML importer uses some WebKit magic underneath, hence the requirement.


You can create your own String extension to increase reusability:

extension String {

    init?(htmlEncodedString: String) {

        guard let data = htmlEncodedString.data(using: .utf8) else {
            return nil
        }

        let options: [NSAttributedString.DocumentReadingOptionKey: Any] = [
            .documentType: NSAttributedString.DocumentType.html,
            .characterEncoding: String.Encoding.utf8.rawValue
        ]

        guard let attributedString = try? NSAttributedString(data: data, options: options, documentAttributes: nil) else {
            return nil
        }

        self.init(attributedString.string)
    }

}


let encodedString = "The Weeknd <em>&#8216;King Of The Fall&#8217;</em>"
let decodedString = String(htmlEncodedString: encodedString)

@zaph 2014-09-01 14:30:49

+1 for the answer, -1 for preferring an extension over a method. It will not be clear to the next developer that stringByConvertingFromHTML is an extension; clarity is the single most important attribute a program can have.

@akashivskyy 2014-09-01 15:16:47

You're right, stringByConvertingFromHTML sounds a lot like a class func. I altered the example to use a custom init method instead.

@zaph 2014-09-01 17:18:39

That misses the point: why add a method to an Apple API when it is clearer to just use a class function? Sure, this is just one, until lots of developers start adding extensions; then the confusion really kicks in.

@akashivskyy 2014-09-01 21:24:25

What? Extensions are meant to extend existing types to provide new functionality.

@akashivskyy 2014-09-01 21:26:46

Sure, a number of 1k+ extensions may become more confusing, but that's not going to happen.

@akashivskyy 2014-09-01 21:29:06

I understand what you're trying to say, but negating extensions isn't the way to go.

@zaph 2014-09-01 23:02:32

One should not use an extension on an Apple API unless it makes the code clearer for the next developer. It should not cause a WTF moment. YMMV.

@Dam 2014-12-10 15:07:10

I have to use UTF-16 to get the same string, but it works the same as gtm_stringByUnescapingFromHTML in Objective-C. The only problem is that it takes so much longer to compute the result that I can't use it in my project. Any idea why it takes so long?

@Martin R 2015-04-25 21:26:26

@akashivskyy: To make this work correctly with non-ASCII characters you have to add an NSCharacterEncodingDocumentAttribute, compare stackoverflow.com/a/27898167/1187415.

@akashivskyy 2015-04-25 22:28:52

@MartinR I didn't know it was required, thanks! Edited my answer.

@Martin R 2015-04-25 22:40:01

I have applied small changes to make it compile with Swift 1.2.

@Guido Lodetti 2015-09-02 09:58:49

This method is extremely heavy and is not recommended in table views or grid views.

@ekill 2015-10-16 21:31:54

I was happy with this for most of this week until I ran the iOS 8 simulator and saw that it was unsuitably slow.

@Muruganandham K 2015-11-02 10:16:43

But loading is too slow if I use this in a cell. What should I do?

@akashivskyy 2015-11-02 22:28:45

Pre-render the strings before using them in cellForRowAtIndexPath. Or use CoreText for increased performance. Or use a third-party library which may be faster (if you find one). ;)
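A rough sketch of the pre-rendering idea, assuming the decoder is the expensive part: decode every title once when the data arrives and let the cell code read the cached result. DecodedTitleCache and its decode closure are hypothetical names, not part of any answer here; plug in whichever decoder you settle on.

```swift
import Foundation

/// Hypothetical helper: decodes each raw string at most once and
/// remembers the result, so configuring a cell is a dictionary lookup.
final class DecodedTitleCache {
    private var storage: [String: String] = [:]
    private let decode: (String) -> String
    /// Counts real decoder calls, only to make the caching visible.
    private(set) var decodeCalls = 0

    /// `decode` is whatever decoder you use (assumption: it is slow,
    /// like the NSAttributedString-based one).
    init(decode: @escaping (String) -> String) {
        self.decode = decode
    }

    /// Call this once after parsing the JSON, before the table reloads.
    func prerender(_ rawTitles: [String]) {
        for raw in rawTitles { _ = title(for: raw) }
    }

    /// Cheap enough to call while configuring a cell.
    func title(for raw: String) -> String {
        if let cached = storage[raw] { return cached }
        decodeCalls += 1
        let decoded = decode(raw)
        storage[raw] = decoded
        return decoded
    }
}

// Stand-in decoder so the sketch runs anywhere; it only handles &amp;.
let cache = DecodedTitleCache { $0.replacingOccurrences(of: "&amp;", with: "&") }
cache.prerender(["Tom &amp; Jerry", "Tom &amp; Jerry"])
print(cache.title(for: "Tom &amp; Jerry")) // Tom & Jerry
print(cache.decodeCalls)                   // 1
```

The cache is not thread-safe as written; it assumes everything happens on the main thread, which the NSAttributedString approach requires anyway.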

@Zaid Pathan 2015-12-12 21:42:16

Swift 2 version of the extension: stackoverflow.com/a/34245313/3411787

@Andrew Johnson 2016-03-25 19:41:06

One way to prevent confusion when extending APIs is to take advantage of Xcode's syntax coloring and give 'Project' items a different color from the 'Other' items.

@Kirill 2016-10-18 16:15:13

Could you please add a Swift 3 version?

@MMV 2018-03-17 19:30:01

This is great! Although it blocks the main thread, is there any way to run it in the background thread?

@Leo Dabus 2019-02-15 16:57:44

It would be better to propagate the error by making the initializer throw the NSAttributedString error, and also to allow the user to pass the data directly to the initializer. Then you can also add an HTML string initializer that calls the data initializer.
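One way that suggestion might look, as a sketch only: a throwing data initializer plus a string initializer that forwards to it. The labels htmlData and decodingHTML are made up for this example, and the code only compiles where UIKit or AppKit is available, since that is where the HTML importer lives.

```swift
import Foundation
#if canImport(UIKit)
import UIKit
#elseif canImport(AppKit)
import AppKit
#endif

#if canImport(UIKit) || canImport(AppKit)
extension String {
    /// Decodes HTML held in `htmlData` (assumed to be UTF-8) and lets
    /// the NSAttributedString error reach the caller instead of hiding it.
    init(htmlData: Data) throws {
        let attributed = try NSAttributedString(
            data: htmlData,
            options: [
                .documentType: NSAttributedString.DocumentType.html,
                .characterEncoding: String.Encoding.utf8.rawValue
            ],
            documentAttributes: nil)
        self = attributed.string
    }

    /// Convenience wrapper that forwards to the data initializer.
    init(decodingHTML html: String) throws {
        try self.init(htmlData: Data(html.utf8))
    }
}

// Call this on the main thread; the HTML importer requires it.
do {
    let title = try String(decodingHTML: "&#8216;King Of The Fall&#8217;")
    print(title)
} catch {
    print("Decoding failed: \(error)")
}
#endif
```

The caller now decides what a failure means (fall back to the raw string, log, or surface the error) rather than the extension deciding for it.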

@Oded Regev 2019-01-08 10:12:42

Objective-C

+ (NSString *)decodeHTMLEncodedString:(NSString *)htmlEncodedString {
    if (!htmlEncodedString) {
        return nil;
    }

    // Convert to UTF-8 data and tell the HTML importer about the encoding
    NSData *data = [htmlEncodedString dataUsingEncoding:NSUTF8StringEncoding];
    NSDictionary *attributes = @{NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType,
                                 NSCharacterEncodingDocumentAttribute: @(NSUTF8StringEncoding)};
    NSAttributedString *attributedString = [[NSAttributedString alloc] initWithData:data options:attributes documentAttributes:nil error:nil];
    return [attributedString string];
}

@Vincent 2018-12-10 16:56:06

Swift4

I really like the solution using documentAttributes. However, it may be too slow for parsing files and/or for use in table view cells. I can't believe that Apple does not provide a decent solution for this.

As a workaround, I found this String extension on GitHub, which works perfectly and decodes quickly.

So for situations in which the given answer is too slow, see the solution suggested in this link: https://gist.github.com/mwaterfall/25b4a6a06dc3309d9555 Note: it does not parse HTML tags.

@pipizanzibar 2017-09-30 09:16:14

Swift 4 Version

extension String {

    init(htmlEncodedString: String) {
        self.init()
        guard let encodedData = htmlEncodedString.data(using: .utf8) else {
            self = htmlEncodedString
            return
        }

        let attributedOptions: [NSAttributedString.DocumentReadingOptionKey : Any] = [
            .documentType: NSAttributedString.DocumentType.html,
            .characterEncoding: String.Encoding.utf8.rawValue
        ]

        do {
            let attributedString = try NSAttributedString(data: encodedData, options: attributedOptions, documentAttributes: nil)
            self = attributedString.string
        } catch {
            print("Error: \(error)")
            self = htmlEncodedString
        }
    }
}

@MickeDG 2017-10-17 16:47:57

I get "Error Domain=NSCocoaErrorDomain Code=259 "The file couldn’t be opened because it isn’t in the correct format."" when I try to use this. This goes away if I run the full do catch on the main thread. I found this from checking the NSAttributedString documentation: "The HTML importer should not be called from a background thread (that is, the options dictionary includes documentType with a value of html). It will try to synchronize with the main thread, fail, and time out."
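A small sketch of one way to honor that requirement from background code: hop to the main thread, run the work there, and hand the result back. onMainThread is a hypothetical helper name, not an Apple API.

```swift
import Foundation
import Dispatch

/// Hypothetical helper: runs `work` on the main thread and returns its
/// result. Runs inline when already on the main thread, because calling
/// DispatchQueue.main.sync from the main thread would deadlock.
func onMainThread<T>(_ work: () -> T) -> T {
    if Thread.isMainThread {
        return work()
    }
    return DispatchQueue.main.sync { work() }
}

// Stand-in work so the sketch runs anywhere; in an app the closure would
// wrap the NSAttributedString-based decoding from the answers above.
let length = onMainThread { "King Of The Fall".count }
print(length) // 16
```

This blocks the calling thread until the main thread gets to the work, and it will deadlock if the main thread is itself waiting on that background thread. It also does nothing about the cost of the conversion; it only moves it to where the importer is allowed to run.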

@vadian 2017-12-04 12:09:55

Please, the rawValue syntax NSAttributedString.DocumentReadingOptionKey(rawValue: NSAttributedString.DocumentAttributeKey.documentType.rawValue) and NSAttributedString.DocumentReadingOptionKey(rawValue: NSAttributedString.DocumentAttributeKey.characterEncoding.rawValue) is horrible. Replace it with .documentType and .characterEncoding

@quemeful 2017-11-04 16:02:10

Swift 4

extension String {
    var replacingHTMLEntities: String? {
        do {
            return try NSAttributedString(data: Data(utf8), options: [
                .documentType: NSAttributedString.DocumentType.html,
                .characterEncoding: String.Encoding.utf8.rawValue
            ], documentAttributes: nil).string
        } catch {
            return nil
        }
    }
}

Simple Usage

let clean = "Weeknd &#8216;King Of The Fall&#8217;".replacingHTMLEntities ?? "default value"

@quemeful 2017-11-04 16:03:44

I can already hear people complaining about my force unwrapped optional. If you are researching HTML string encoding and you do not know how to deal with Swift optionals, you're too far ahead of yourself.

@quemeful 2018-11-05 12:31:53

Yup, there it was (edited Nov 1 at 22:37, which made the "Simple Usage" much harder to comprehend).

@Deepak 2018-10-29 08:52:56

Swift 4.1 +

extension String {
    var htmlDecoded: String {
        let attributedOptions: [NSAttributedString.DocumentReadingOptionKey : Any] = [
            NSAttributedString.DocumentReadingOptionKey.documentType : NSAttributedString.DocumentType.html,
            NSAttributedString.DocumentReadingOptionKey.characterEncoding : String.Encoding.utf8.rawValue
        ]

        let decoded = try? NSAttributedString(data: Data(utf8), options: attributedOptions,
                                              documentAttributes: nil).string

        // Fall back to the original string if decoding fails
        return decoded ?? self
    }
}

@Naishta 2018-08-16 19:44:58

Swift 4:

The total solution that finally worked for me with html code and newline characters and single quotes

extension String {
    var htmlDecoded: String {
        let decoded = try? NSAttributedString(data: Data(utf8), options: [
            .documentType: NSAttributedString.DocumentType.html,
            .characterEncoding: String.Encoding.utf8.rawValue
            ], documentAttributes: nil).string

        return decoded ?? self
    }
}

Usage:

let yourStringEncoded = yourStringWithHtmlcode.htmlDecoded

I then had to apply some more filters to get rid of single quotes (e.g. don't, hasn't, It's) and newline characters like \n:

var yourNewString = String(yourStringEncoded.filter { !"\n\t\r".contains($0) })
yourNewString = yourNewString.replacingOccurrences(of: "\'", with: "", options: NSString.CompareOptions.literal, range: nil)

@rmaddy 2018-11-01 22:31:22

This is essentially a copy of this other answer. All you did is add some usage which is obvious enough.

@Naishta 2018-11-02 11:19:10

Someone has upvoted this answer and found it really useful; what does that tell you?

@Haroldo Gondim 2018-08-08 14:10:01

Swift 4

func decodeHTML(string: String) -> String? {

    var decodedString: String?

    if let encodedData = string.data(using: .utf8) {
        let attributedOptions: [NSAttributedString.DocumentReadingOptionKey : Any] = [
            .documentType: NSAttributedString.DocumentType.html,
            .characterEncoding: String.Encoding.utf8.rawValue
        ]

        do {
            decodedString = try NSAttributedString(data: encodedData, options: attributedOptions, documentAttributes: nil).string
        } catch {
            print("\(error.localizedDescription)")
        }
    }

    return decodedString
}

@iLandes 2018-04-30 10:21:31

Elegant Swift 4 Solution

If you want a string

myString = String(htmlString: encodedString)

Add this extension to your project

extension String {

    init(htmlString: String) {
        self.init()
        guard let encodedData = htmlString.data(using: .utf8) else {
            self = htmlString
            return
        }

        let attributedOptions: [NSAttributedString.DocumentReadingOptionKey : Any] = [
           .documentType: NSAttributedString.DocumentType.html,
           .characterEncoding: String.Encoding.utf8.rawValue
        ]

        do {
            let attributedString = try NSAttributedString(data: encodedData,
                                                          options: attributedOptions,
                                                          documentAttributes: nil)
            self = attributedString.string
        } catch {
            print("Error: \(error.localizedDescription)")
            self = htmlString
        }
    }
}

If you want an NSAttributedString with Bold, Italic, Links etc:

textField.attributedText = try? NSAttributedString(htmlString: encodedString)

Add this extension to your project

extension NSAttributedString {

    convenience init(htmlString html: String) throws {
        try self.init(data: Data(html.utf8), options: [
            .documentType: NSAttributedString.DocumentType.html,
            .characterEncoding: String.Encoding.utf8.rawValue
            ], documentAttributes: nil)
    }

}

@Despotovic 2018-03-22 10:21:22

Have a look at HTMLString - a library written in Swift that allows your program to add and remove HTML entities in Strings

For completeness, I copied main features from the site:

  • Adds entities for ASCII and UTF-8/UTF-16 encodings
  • Removes more than 2100 named entities (like &amp;)
  • Supports removing decimal and hexadecimal entities
  • Designed to support Swift Extended Grapheme Clusters (→ 100% emoji-proof)
  • Fully unit tested
  • Fast
  • Documented
  • Compatible with Objective-C

@Martin R 2015-05-09 15:21:36

@akashivskyy's answer is great and demonstrates how to utilize NSAttributedString to decode HTML entities. One possible disadvantage (as he stated) is that all HTML markup is removed as well, so

<strong> 4 &lt; 5 &amp; 3 &gt; 2</strong>

becomes

4 < 5 & 3 > 2

On OS X there is CFXMLCreateStringByUnescapingEntities() which does the job:

let encoded = "<strong> 4 &lt; 5 &amp; 3 &gt; 2 .</strong> Price: 12 &#x20ac;.  &#64; "
let decoded = CFXMLCreateStringByUnescapingEntities(nil, encoded, nil) as String
println(decoded)
// <strong> 4 < 5 & 3 > 2 .</strong> Price: 12 €.  @ 

but this is not available on iOS.

Here is a pure Swift implementation. It decodes character entity references like &lt; using a dictionary, and all numeric character entities like &#64; or &#x20ac;. (Note that I did not list all 252 HTML entities explicitly.)

Swift 4:

// Mapping from XML/HTML character entity reference to character
// From http://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references
private let characterEntities : [ Substring : Character ] = [
    // XML predefined entities:
    "&quot;"    : "\"",
    "&amp;"     : "&",
    "&apos;"    : "'",
    "&lt;"      : "<",
    "&gt;"      : ">",

    // HTML character entity references:
    "&nbsp;"    : "\u{00a0}",
    // ...
    "&diams;"   : "♦",
]

extension String {

    /// Returns a new string made by replacing in the `String`
    /// all HTML character entity references with the corresponding
    /// character.
    var stringByDecodingHTMLEntities : String {

        // ===== Utility functions =====

        // Convert the number in the string to the corresponding
        // Unicode character, e.g.
        //    decodeNumeric("64", 10)   --> "@"
        //    decodeNumeric("20ac", 16) --> "€"
        func decodeNumeric(_ string : Substring, base : Int) -> Character? {
            guard let code = UInt32(string, radix: base),
                let uniScalar = UnicodeScalar(code) else { return nil }
            return Character(uniScalar)
        }

        // Decode the HTML character entity to the corresponding
        // Unicode character, return `nil` for invalid input.
        //     decode("&#64;")    --> "@"
        //     decode("&#x20ac;") --> "€"
        //     decode("&lt;")     --> "<"
        //     decode("&foo;")    --> nil
        func decode(_ entity : Substring) -> Character? {

            if entity.hasPrefix("&#x") || entity.hasPrefix("&#X") {
                return decodeNumeric(entity.dropFirst(3).dropLast(), base: 16)
            } else if entity.hasPrefix("&#") {
                return decodeNumeric(entity.dropFirst(2).dropLast(), base: 10)
            } else {
                return characterEntities[entity]
            }
        }

        // ===== Method starts here =====

        var result = ""
        var position = startIndex

        // Find the next '&' and copy the characters preceding it to `result`:
        while let ampRange = self[position...].range(of: "&") {
            result.append(contentsOf: self[position ..< ampRange.lowerBound])
            position = ampRange.lowerBound

            // Find the next ';' and copy everything from '&' to ';' into `entity`
            guard let semiRange = self[position...].range(of: ";") else {
                // No matching ';'.
                break
            }
            let entity = self[position ..< semiRange.upperBound]
            position = semiRange.upperBound

            if let decoded = decode(entity) {
                // Replace by decoded character:
                result.append(decoded)
            } else {
                // Invalid entity, copy verbatim:
                result.append(contentsOf: entity)
            }
        }
        // Copy remaining characters to `result`:
        result.append(contentsOf: self[position...])
        return result
    }
}

Example:

let encoded = "<strong> 4 &lt; 5 &amp; 3 &gt; 2 .</strong> Price: 12 &#x20ac;.  &#64; "
let decoded = encoded.stringByDecodingHTMLEntities
print(decoded)
// <strong> 4 < 5 & 3 > 2 .</strong> Price: 12 €.  @

Swift 3:

// Mapping from XML/HTML character entity reference to character
// From http://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references
private let characterEntities : [ String : Character ] = [
    // XML predefined entities:
    "&quot;"    : "\"",
    "&amp;"     : "&",
    "&apos;"    : "'",
    "&lt;"      : "<",
    "&gt;"      : ">",

    // HTML character entity references:
    "&nbsp;"    : "\u{00a0}",
    // ...
    "&diams;"   : "♦",
]

extension String {

    /// Returns a new string made by replacing in the `String`
    /// all HTML character entity references with the corresponding
    /// character.
    var stringByDecodingHTMLEntities : String {

        // ===== Utility functions =====

        // Convert the number in the string to the corresponding
        // Unicode character, e.g.
        //    decodeNumeric("64", 10)   --> "@"
        //    decodeNumeric("20ac", 16) --> "€"
        func decodeNumeric(_ string : String, base : Int) -> Character? {
            guard let code = UInt32(string, radix: base),
                let uniScalar = UnicodeScalar(code) else { return nil }
            return Character(uniScalar)
        }

        // Decode the HTML character entity to the corresponding
        // Unicode character, return `nil` for invalid input.
        //     decode("&#64;")    --> "@"
        //     decode("&#x20ac;") --> "€"
        //     decode("&lt;")     --> "<"
        //     decode("&foo;")    --> nil
        func decode(_ entity : String) -> Character? {

            if entity.hasPrefix("&#x") || entity.hasPrefix("&#X"){
                return decodeNumeric(entity.substring(with: entity.index(entity.startIndex, offsetBy: 3) ..< entity.index(entity.endIndex, offsetBy: -1)), base: 16)
            } else if entity.hasPrefix("&#") {
                return decodeNumeric(entity.substring(with: entity.index(entity.startIndex, offsetBy: 2) ..< entity.index(entity.endIndex, offsetBy: -1)), base: 10)
            } else {
                return characterEntities[entity]
            }
        }

        // ===== Method starts here =====

        var result = ""
        var position = startIndex

        // Find the next '&' and copy the characters preceding it to `result`:
        while let ampRange = self.range(of: "&", range: position ..< endIndex) {
            result.append(self[position ..< ampRange.lowerBound])
            position = ampRange.lowerBound

            // Find the next ';' and copy everything from '&' to ';' into `entity`
            if let semiRange = self.range(of: ";", range: position ..< endIndex) {
                let entity = self[position ..< semiRange.upperBound]
                position = semiRange.upperBound

                if let decoded = decode(entity) {
                    // Replace by decoded character:
                    result.append(decoded)
                } else {
                    // Invalid entity, copy verbatim:
                    result.append(entity)
                }
            } else {
                // No matching ';'.
                break
            }
        }
        // Copy remaining characters to `result`:
        result.append(self[position ..< endIndex])
        return result
    }
}

Swift 2:

// Mapping from XML/HTML character entity reference to character
// From http://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references
private let characterEntities : [ String : Character ] = [
    // XML predefined entities:
    "&quot;"    : "\"",
    "&amp;"     : "&",
    "&apos;"    : "'",
    "&lt;"      : "<",
    "&gt;"      : ">",

    // HTML character entity references:
    "&nbsp;"    : "\u{00a0}",
    // ...
    "&diams;"   : "♦",
]

extension String {

    /// Returns a new string made by replacing in the `String`
    /// all HTML character entity references with the corresponding
    /// character.
    var stringByDecodingHTMLEntities : String {

        // ===== Utility functions =====

        // Convert the number in the string to the corresponding
        // Unicode character, e.g.
        //    decodeNumeric("64", 10)   --> "@"
        //    decodeNumeric("20ac", 16) --> "€"
        func decodeNumeric(string : String, base : Int32) -> Character? {
            let code = UInt32(strtoul(string, nil, base))
            return Character(UnicodeScalar(code))
        }

        // Decode the HTML character entity to the corresponding
        // Unicode character, return `nil` for invalid input.
        //     decode("&#64;")    --> "@"
        //     decode("&#x20ac;") --> "€"
        //     decode("&lt;")     --> "<"
        //     decode("&foo;")    --> nil
        func decode(entity : String) -> Character? {

            if entity.hasPrefix("&#x") || entity.hasPrefix("&#X"){
                return decodeNumeric(entity.substringFromIndex(entity.startIndex.advancedBy(3)), base: 16)
            } else if entity.hasPrefix("&#") {
                return decodeNumeric(entity.substringFromIndex(entity.startIndex.advancedBy(2)), base: 10)
            } else {
                return characterEntities[entity]
            }
        }

        // ===== Method starts here =====

        var result = ""
        var position = startIndex

        // Find the next '&' and copy the characters preceding it to `result`:
        while let ampRange = self.rangeOfString("&", range: position ..< endIndex) {
            result.appendContentsOf(self[position ..< ampRange.startIndex])
            position = ampRange.startIndex

            // Find the next ';' and copy everything from '&' to ';' into `entity`
            if let semiRange = self.rangeOfString(";", range: position ..< endIndex) {
                let entity = self[position ..< semiRange.endIndex]
                position = semiRange.endIndex

                if let decoded = decode(entity) {
                    // Replace by decoded character:
                    result.append(decoded)
                } else {
                    // Invalid entity, copy verbatim:
                    result.appendContentsOf(entity)
                }
            } else {
                // No matching ';'.
                break
            }
        }
        // Copy remaining characters to `result`:
        result.appendContentsOf(self[position ..< endIndex])
        return result
    }
}

@Michael Waterfall 2015-08-27 17:11:01

This is brilliant, thanks Martin! Here's the extension with the full list of HTML entities: gist.github.com/mwaterfall/25b4a6a06dc3309d9555 I've also slightly adapted it to provide the distance offsets made by the replacements. This allows the correct adjustment of any string attributes or entities that might be affected by these replacements (Twitter entity indices for example).

@Santiago 2015-09-17 22:29:34

@MichaelWaterfall and Martin, this is magnificent! Works like a charm! I updated the extension for Swift 2: pastebin.com/juHRJ6au Thanks!

@Matti 2015-09-21 14:12:21

This answer should be preferred over the accepted one. The accepted answer is unusable for longer texts.

@Adela Chang 2016-04-15 16:33:38

I converted this answer to be compatible with Swift 2 and dumped it in a CocoaPod called StringExtensionHTML for ease of use. Note that Santiago's Swift 2 version fixes the compile-time errors, but taking out the strtoul(string, nil, base) call entirely will cause the code not to work with numeric character entities and to crash when it comes to an entity it doesn't recognize (instead of failing gracefully).

@Martin R 2016-04-15 18:02:20

@AdelaChang: Actually I had converted my answer to Swift 2 already in September 2015. It still compiles without warnings with Swift 2.2/Xcode 7.3. Or are you referring to Michael's version?

@Adela Chang 2016-04-28 20:33:16

@MartinR I was actually referring to Santiago's version up above in pastebin. The first time I saw this answer was long ago, so I must have missed the fact that you updated it, but the errors I was referring to were in the pastebin version and not yours. :)

@Martin R 2016-09-06 08:37:50

@yishus: Thanks for fixing the error in the Swift 3 code! (Previously, I had used strtoul() which silently ignores trailing non-digits.)
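To make the difference concrete, here is a short sketch comparing the two parsing routes discussed in these comments. Nothing here is new API; the variable names are just for illustration.

```swift
import Foundation

// strtoul stops at the first character that is not a digit and returns
// what it has parsed so far, so the trailing ";" goes unnoticed.
let lenient = strtoul("64;", nil, 10)

// UInt32(_:radix:) only succeeds when the whole string is a number.
let strict = UInt32("64;", radix: 10)
let clean = UInt32("20ac", radix: 16)

// Unicode.Scalar's initializer is failable as well: surrogate code
// points such as 0xD800 yield nil instead of a crash.
let surrogate = Unicode.Scalar(UInt32(0xD800))

print(lenient)          // 64
print(strict as Any)    // nil
print(clean as Any)     // Optional(8364)
print(surrogate as Any) // nil
```

This is why the Swift 3 and Swift 4 versions above strip the trailing ";" before parsing and return nil for anything that is not a valid number or scalar, which lets the caller copy the unrecognized entity verbatim.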

@yesthisisjoe 2016-09-18 00:45:31

Thank you for the OSX version. So much easier.

@Kwnstantinos Natsios 2016-10-19 11:33:08

Great answer!!! Thanks a lot :D

@user1118321 2018-02-18 00:19:25

This is a great answer. I did get some errors compiling it with Swift 4.1 in Xcode 9.2. They were easily fixed by the compiler's suggestions, but it might be worth updating one more time.

@Martin R 2018-02-18 18:09:00

@user1118321: Code updated, thanks for letting me know.

@Andrea Mugnaini 2018-05-14 03:17:55

Thanks, with this answer I solved my issues: I had serious performance problems using NSAttributedString.

@AamirR 2017-11-24 22:43:06

Swift 4


  • String extension computed var
  • Without extra guard/do/catch etc...
  • Returns the original string if decoding fails

extension String {
    var htmlDecoded: String {
        let decoded = try? NSAttributedString(data: Data(utf8), options: [
            .documentType: NSAttributedString.DocumentType.html,
            .characterEncoding: String.Encoding.utf8.rawValue
        ], documentAttributes: nil).string

        return decoded ?? self
    }
}

@Naishta 2018-08-11 15:03:54

Wow! Works right out of the box for Swift 4! Usage: let encoded = "The Weeknd &#8216;King Of The Fall&#8217;"; let finalString = encoded.htmlDecoded

@Jeremy Hicks 2019-01-30 04:14:11

I love the simplicity of this answer. However, it will cause crashes when run in the background because it tries to run on the main thread.

@Omar Freewan 2017-11-05 08:32:17

SWIFT 4

extension String {

    mutating func toHtmlEncodedString() {
        guard let encodedData = self.data(using: .utf8) else {
            return
        }

        let attributedOptions: [NSAttributedString.DocumentReadingOptionKey : Any] = [
            NSAttributedString.DocumentReadingOptionKey(rawValue: NSAttributedString.DocumentAttributeKey.documentType.rawValue): NSAttributedString.DocumentType.html,
            NSAttributedString.DocumentReadingOptionKey(rawValue: NSAttributedString.DocumentAttributeKey.characterEncoding.rawValue): String.Encoding.utf8.rawValue
        ]

        do {
            let attributedString = try NSAttributedString(data: encodedData, options: attributedOptions, documentAttributes: nil)
            self = attributedString.string
        } catch {
            // Leave the string unchanged if decoding fails
            print("Error: \(error)")
        }
    }
}

@vadian 2017-12-04 12:11:58

Please, the rawValue syntax NSAttributedString.DocumentReadingOptionKey(rawValue: NSAttributedString.DocumentAttributeKey.documentType.rawValue) and NSAttributedString.DocumentReadingOptionKey(rawValue: NSAttributedString.DocumentAttributeKey.characterEncoding.rawValue) is horrible. Replace it with .documentType and .characterEncoding

@Vincent 2018-02-05 10:19:44

Performance of this solution is horrible. It may be okay for isolated cases, but it is not advisable for parsing files.

@Fangming 2017-07-15 02:44:59

Swift 3.0 version with actual font size conversion

Normally, if you directly convert HTML to an attributed string, the font size is increased. You can try converting an HTML string to an attributed string and back again to see the difference.

Instead, here is a conversion that makes sure the font size does not change, by applying a 0.75 ratio to all fonts:

extension String {
    func htmlAttributedString() -> NSAttributedString? {
        guard let data = self.data(using: String.Encoding.utf16, allowLossyConversion: false) else { return nil }
        guard let attriStr = try? NSMutableAttributedString(
            data: data,
            options: [NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType],
            documentAttributes: nil) else { return nil }
        attriStr.beginEditing()
        attriStr.enumerateAttribute(NSFontAttributeName, in: NSMakeRange(0, attriStr.length), options: .init(rawValue: 0)) {
            (value, range, stop) in
            if let font = value as? UIFont {
                let resizedFont = font.withSize(font.pointSize * 0.75)
                attriStr.addAttribute(NSFontAttributeName,
                                         value: resizedFont,
                                         range: range)
            }
        }
        attriStr.endEditing()
        return attriStr
    }
}

@Geva 2017-02-23 08:09:22

Computed var version of @yishus' answer

public extension String {
    /// Decodes string with html encoding.
    var htmlDecoded: String {
        guard let encodedData = self.data(using: .utf8) else { return self }

        let attributedOptions: [String : Any] = [
            NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType,
            NSCharacterEncodingDocumentAttribute: String.Encoding.utf8.rawValue]

        do {
            let attributedString = try NSAttributedString(data: encodedData,
                                                          options: attributedOptions,
                                                          documentAttributes: nil)
            return attributedString.string
        } catch {
            print("Error: \(error)")
            return self
        }
    }
}

@ravalboy 2017-02-10 11:15:05

Updated answer working on Swift 3

extension String {
    init?(htmlEncodedString: String) {
        // Fail instead of force-unwrapping if the UTF-8 conversion fails
        guard let encodedData = htmlEncodedString.data(using: .utf8) else {
            return nil
        }
        let attributedOptions = [NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType]

        guard let attributedString = try? NSAttributedString(data: encodedData, options: attributedOptions, documentAttributes: nil) else {
            return nil
        }
        self.init(attributedString.string)
    }
}

@Youming Lin 2016-09-29 15:50:15

I was looking for a pure Swift 3.0 utility to escape to/unescape from HTML character references (i.e. for server-side Swift apps on both macOS and Linux) but didn't find any comprehensive solutions, so I wrote my own implementation: https://github.com/IBM-Swift/swift-html-entities

The package, HTMLEntities, works with HTML4 named character references as well as hex/dec numeric character references, and it will recognize special numeric character references per the W3 HTML5 spec (i.e. &#x80; should be unescaped as the Euro sign (unicode U+20AC) and NOT as the unicode character for U+0080, and certain ranges of numeric character references should be replaced with the replacement character U+FFFD when unescaping).

Usage example:

import HTMLEntities

// encode example
let html = "<script>alert(\"abc\")</script>"

print(html.htmlEscape())
// Prints "&lt;script&gt;alert(&quot;abc&quot;)&lt;/script&gt;"

// decode example
let htmlencoded = "&lt;script&gt;alert(&quot;abc&quot;)&lt;/script&gt;"

print(htmlencoded.htmlUnescape())
// Prints "<script>alert(\"abc\")</script>"

And for OP's example:

print("The Weeknd &#8216;King Of The Fall&#8217; [Video Premiere] | @TheWeeknd | #SoPhi ".htmlUnescape())
// prints "The Weeknd ‘King Of The Fall’ [Video Premiere] | @TheWeeknd | #SoPhi "

Edit: HTMLEntities now supports HTML5 named character references as of version 2.0.0. Spec-compliant parsing is also implemented.
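
To make the numeric-reference rules described above more concrete, here is a minimal sketch in plain Swift 5. It is not the HTMLEntities implementation: the function name `decodeNumericReferences` and the tiny remapping table are assumptions made for illustration, and only decimal and hexadecimal references are handled. It shows the two special cases mentioned above: a remapped code point (`&#x80;` becomes U+20AC) and an invalid one (a surrogate becomes U+FFFD).

```swift
// Hypothetical helper (not part of HTMLEntities): decodes only numeric
// character references such as &#8216; or &#x2018;.
func decodeNumericReferences(_ input: String) -> String {
    // A small subset of the Windows-1252 remappings required by the HTML5 spec.
    let remapped: [UInt32: UInt32] = [
        0x80: 0x20AC, 0x91: 0x2018, 0x92: 0x2019, 0x93: 0x201C, 0x94: 0x201D,
    ]
    let replacement: Unicode.Scalar = "\u{FFFD}"

    // Tries to read one reference at the start of `tail` (which begins with "&").
    // Returns the decoded scalar and the index just past the terminating ";".
    func parse(_ tail: Substring) -> (scalar: Unicode.Scalar, next: String.Index)? {
        guard tail.hasPrefix("&#"), let semi = tail.firstIndex(of: ";") else { return nil }
        var digits = tail[tail.index(tail.startIndex, offsetBy: 2)..<semi]
        var radix = 10
        if let first = digits.first, first == "x" || first == "X" {
            radix = 16
            digits = digits.dropFirst()
        }
        guard !digits.isEmpty, digits.allSatisfy({ $0.isHexDigit }),
              let code = UInt32(digits, radix: radix) else { return nil }
        let mapped = remapped[code] ?? code
        // Zero, surrogates and out-of-range values become the replacement character.
        let scalar = mapped == 0 ? replacement : (Unicode.Scalar(mapped) ?? replacement)
        return (scalar, tail.index(after: semi))
    }

    var result = ""
    var rest = Substring(input)
    while let amp = rest.firstIndex(of: "&") {
        // Copy everything before the ampersand unchanged.
        result.append(contentsOf: rest[..<amp])
        let tail = rest[amp...]
        if let parsed = parse(tail) {
            result.append(Character(parsed.scalar))
            rest = tail[parsed.next...]
        } else {
            // Not a numeric reference: keep the ampersand literally and move on.
            result.append("&")
            rest = tail.dropFirst()
        }
    }
    result.append(contentsOf: rest)
    return result
}
```

Called on the title from the question, `decodeNumericReferences("The Weeknd &#8216;King Of The Fall&#8217;")` yields the string with typographic quotes. Named references such as `&amp;` are deliberately left untouched in this sketch.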

@Stéphane Copin 2017-11-06 14:13:36

This is the most generic answer: it works every time and does not need to run on the main thread. It works even with the most complex HTML-escaped Unicode strings (such as (&nbsp;͡&deg;&nbsp;͜ʖ&nbsp;͡&deg;&nbsp;)), whereas none of the other answers manage that.

@yishus 2016-09-06 08:39:49

Swift 3 version of @akashivskyy's extension:

extension String {
    init(htmlEncodedString: String) {
        self.init()
        guard let encodedData = htmlEncodedString.data(using: .utf8) else {
            self = htmlEncodedString
            return
        }

        let attributedOptions: [String : Any] = [
            NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType,
            NSCharacterEncodingDocumentAttribute: String.Encoding.utf8.rawValue
        ]

        do {
            let attributedString = try NSAttributedString(data: encodedData, options: attributedOptions, documentAttributes: nil)
            self = attributedString.string
        } catch {
            print("Error: \(error)")
            self = htmlEncodedString
        }
    }
}

@Geoherna 2016-11-24 08:10:01

Works great. Original answer was causing weird crash. Thanks for update!

@iLandes 2017-08-29 10:13:27

For French characters, I had to use UTF-16.

@Zaid Pathan 2015-12-12 21:41:40

Swift 2 version of @akashivskyy's extension:

extension String {
    init(htmlEncodedString: String) {
        if let encodedData = htmlEncodedString.dataUsingEncoding(NSUTF8StringEncoding) {
            let attributedOptions: [String: AnyObject] = [
                NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType,
                NSCharacterEncodingDocumentAttribute: NSUTF8StringEncoding
            ]

            do {
                // The initializer throws instead of returning an optional, so no `if let` is needed
                let attributedString = try NSAttributedString(data: encodedData, options: attributedOptions, documentAttributes: nil)
                self.init(attributedString.string)
            } catch {
                print("error: \(error)")
                self.init(htmlEncodedString) // Fall back to the original string on error
            }
        } else {
            self.init(htmlEncodedString) // Fall back to the original string on error
        }
    }
}

@oyalhi 2016-04-25 10:08:39

This code is incomplete and should be avoided by all means. The error is not being handled properly: when there is in fact an error, the code would crash. You should update your code to at least return nil when there is an error, or you could just init with the original string. In the end you should handle the error, which is not the case here. Wow!

@Zaid Pathan 2016-05-31 20:51:33

@oyalhi updated.

@Yogesh shelke 2016-03-03 06:40:51

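// dataRes stands for the NSData value you received, e.g. jsonData in the question
let dataRes: NSData = jsonData

// Decodes the bytes as UTF-8 text; HTML entities such as &#8216; are left as they are
let resString = NSString(data: dataRes, encoding: NSUTF8StringEncoding)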

@Bseaborn 2015-10-27 16:50:34

This would be my approach. You could add the entities dictionary from the gist Michael Waterfall mentions: https://gist.github.com/mwaterfall/25b4a6a06dc3309d9555

extension String {
    func htmlDecoded() -> String {

        guard !isEmpty else { return self }

        var newStr = self

        // Ordered pairs rather than a dictionary: "&amp;" must be replaced last,
        // otherwise "&amp;lt;" could be decoded twice and end up as "<".
        let entities: [(String, String)] = [
            ("&quot;", "\""),
            ("&apos;", "'"),
            ("&lt;", "<"),
            ("&gt;", ">"),
            ("&amp;", "&"),
        ]

        for (name, value) in entities {
            newStr = newStr.stringByReplacingOccurrencesOfString(name, withString: value)
        }
        return newStr
    }
}

Examples used:

let encoded = "this is so &quot;good&quot;"
let decoded = encoded.htmlDecoded() // "this is so "good""

OR

let encoded = "this is so &quot;good&quot;".htmlDecoded() // "this is so "good""
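
A table-driven decoder can also work in a single left-to-right pass, so that text which has already been decoded is never scanned again (with repeated whole-string replacement, `&amp;lt;` can end up as `<` if `&amp;` happens to be replaced before `&lt;`). The following is only a sketch in Swift 5 syntax: the names `namedEntities` and `decodeNamedEntities` are made up for illustration, and the table holds just the five entities listed above.

```swift
// Hypothetical single-pass variant of the table approach.
// Only the five entities from the answer are listed; extend as needed.
let namedEntities: [String: Character] = [
    "&quot;": "\"", "&amp;": "&", "&apos;": "'", "&lt;": "<", "&gt;": ">",
]

func decodeNamedEntities(_ input: String) -> String {
    var result = ""
    var rest = Substring(input)

    while let amp = rest.firstIndex(of: "&") {
        // Copy everything before the ampersand unchanged.
        result.append(contentsOf: rest[..<amp])
        let tail = rest[amp...]
        // Take the text up to the next ";" as a candidate and look it up in the table.
        if let semi = tail.firstIndex(of: ";"),
           let decoded = namedEntities[String(tail[...semi])] {
            result.append(decoded)
            rest = tail[tail.index(after: semi)...]
        } else {
            // Unknown or unterminated entity: keep the ampersand literally.
            result.append("&")
            rest = tail.dropFirst()
        }
    }
    result.append(contentsOf: rest)
    return result
}
```

With this version, `decodeNamedEntities("&amp;lt;")` gives `&lt;`, because the `&` produced by decoding `&amp;` goes straight to the output and is not examined a second time.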

@jrmgx 2015-11-02 13:27:58

I don't quite like this, but I haven't found anything better yet, so here is an updated version of Michael Waterfall's solution for Swift 2.0: gist.github.com/jrmgx/3f9f1d330b295cf6b1c6

@wLc 2015-09-01 16:48:01

extension String{
    func decodeEnt() -> String{
        let encodedData = self.dataUsingEncoding(NSUTF8StringEncoding)!
        let attributedOptions : [String: AnyObject] = [
            NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType,
            NSCharacterEncodingDocumentAttribute: NSUTF8StringEncoding
        ]
        let attributedString = NSAttributedString(data: encodedData, options: attributedOptions, documentAttributes: nil, error: nil)!

        return attributedString.string
    }
}

let encodedString = "The Weeknd &#8216;King Of The Fall&#8217;"

let foo = encodedString.decodeEnt() // The Weeknd ‘King Of The Fall’
