http://blog.codingmylife.com/?p=23
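While trying to scrape the home page of Qidian (起点中文网) today, I ran into a problem: if you get the encoding wrong, you can't read anything at all. This is a bad habit I picked up from using C# too much, where I almost never had to think about encodings. In C#, even with the wrong encoding you still get something back, just a screenful of garbage. Cocoa is harsher: the call fails outright, returning nil and an error. Below is the code I first wrote, which downloads the Qidian home page into a string.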
NSURL *url = [NSURL URLWithString:@"http://www.cmfu.com"];
NSError *error = nil;
// Decode the downloaded bytes as UTF-8; on failure this returns nil and fills in error.
NSString *xml = [NSString stringWithContentsOfURL:url encoding:NSUTF8StringEncoding error:&error];
if (xml == nil)
{
    NSLog(@"Error reading url at %@", [error localizedFailureReason]);
}
else
{
    [result setString:xml];
}
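The reason a strict decoder refuses such a page, instead of returning garbage, can be sketched in plain C (which also compiles as Objective-C). This is only an illustration, not what NSString does internally: looks_like_utf8 is a hypothetical helper that checks lead and continuation byte patterns, and the sample bytes are assumed to be the GBK and UTF-8 encodings of the two characters 起点.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: returns true if bytes[0..len) has the byte structure
   of UTF-8. Simplified: it checks lead/continuation patterns only and does
   not reject overlong forms or surrogates, as a full decoder would. */
bool looks_like_utf8(const unsigned char *bytes, size_t len)
{
    assert(bytes != NULL || len == 0);
    size_t i = 0;
    while (i < len) {
        unsigned char b = bytes[i];
        size_t extra;                               /* continuation bytes expected */
        if (b < 0x80)                extra = 0;     /* 0xxxxxxx: ASCII */
        else if ((b & 0xE0) == 0xC0) extra = 1;     /* 110xxxxx */
        else if ((b & 0xF0) == 0xE0) extra = 2;     /* 1110xxxx */
        else if ((b & 0xF8) == 0xF0) extra = 3;     /* 11110xxx */
        else return false;                          /* stray continuation or invalid lead */
        if (extra > len - i - 1) return false;      /* sequence is cut off */
        for (size_t k = 1; k <= extra; k++)
            if ((bytes[i + k] & 0xC0) != 0x80)      /* must look like 10xxxxxx */
                return false;
        i += extra + 1;
    }
    return true;
}

/* Assumed sample data: the two characters 起点 in each encoding. */
const unsigned char QIDIAN_GBK[4]  = {0xC6, 0xF0, 0xB5, 0xE3};
const unsigned char QIDIAN_UTF8[6] = {0xE8, 0xB5, 0xB7, 0xE7, 0x82, 0xB9};
```

With these bytes, looks_like_utf8(QIDIAN_GBK, 4) is false: 0xC6 announces a two-byte sequence, but 0xF0 is not a continuation byte. The UTF-8 bytes pass the same check.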
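The download failed no matter what I did, and the error message said the encoding was wrong. Fine. I opened the documentation and looked up all the encodings: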
enum {
    NSASCIIStringEncoding = 1,
    NSNEXTSTEPStringEncoding = 2,
    NSJapaneseEUCStringEncoding = 3,
    NSUTF8StringEncoding = 4,
    NSISOLatin1StringEncoding = 5,
    NSSymbolStringEncoding = 6,
    NSNonLossyASCIIStringEncoding = 7,
    NSShiftJISStringEncoding = 8,
    NSISOLatin2StringEncoding = 9,
    NSUnicodeStringEncoding = 10,
    NSWindowsCP1251StringEncoding = 11,
    NSWindowsCP1252StringEncoding = 12,
    NSWindowsCP1253StringEncoding = 13,
    NSWindowsCP1254StringEncoding = 14,
    NSWindowsCP1250StringEncoding = 15,
    NSISO2022JPStringEncoding = 21,
    NSMacOSRomanStringEncoding = 30,
    NSProprietaryStringEncoding = 65536
};
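I tried them one by one, and not a single one worked! I was about to lose it. What year is this, and Cocoa still doesn't support Chinese? That can't be right. My guess was that the documentation only lists the most commonly used encodings (most commonly used according to Apple, that is, which apparently means ignoring China entirely. Boo!), so I wrote the following code to print every supported encoding: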
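// availableStringEncodings returns a zero-terminated C array of encodings.
const NSStringEncoding *encodings = [NSString availableStringEncodings];
NSMutableString *str = [NSMutableString string];
NSStringEncoding encoding;
while ((encoding = *encodings++) != 0)
{
    // Cast to int to match %i; values above 0x7FFFFFFF therefore print as negative numbers.
    [str appendFormat:@"%@ === %i\n", [NSString localizedNameOfStringEncoding:encoding], (int)encoding];
}
[result setString:str];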
Sure enough, I had guessed right. Here is the full list of supported encodings:
Western (Mac OS Roman) === 30
Japanese (Mac OS) === -2147483647
Traditional Chinese (Mac OS) === -2147483646
Korean (Mac OS) === -2147483645
Arabic (Mac OS) === -2147483644
Hebrew (Mac OS) === -2147483643
Greek (Mac OS) === -2147483642
Cyrillic (Mac OS) === -2147483641
Devanagari (Mac OS) === -2147483639
Gurmukhi (Mac OS) === -2147483638
Gujarati (Mac OS) === -2147483637
Thai (Mac OS) === -2147483627
Simplified Chinese (Mac OS) === -2147483623
Tibetan (Mac OS) === -2147483622
Central European (Mac OS) === -2147483619
Symbol (Mac OS) === 6
Dingbats (Mac OS) === -2147483614
Turkish (Mac OS) === -2147483613
Croatian (Mac OS) === -2147483612
Icelandic (Mac OS) === -2147483611
Romanian (Mac OS) === -2147483610
Celtic (Mac OS) === -2147483609
Gaelic (Mac OS) === -2147483608
Keyboard Symbols (Mac OS) === -2147483607
Farsi (Mac OS) === -2147483508
Cyrillic (Mac OS Ukrainian) === -2147483496
Inuit (Mac OS) === -2147483412
Unicode (UTF-32LE) === -1677721344
Unicode (UTF-8) === 4
Unicode (UTF-16) === 10
Unicode (UTF-16BE) === -1879047936
Unicode (UTF-16LE) === -1811939072
Unicode (UTF-32) === -1946156800
Unicode (UTF-32BE) === -1744830208
Western (ISO Latin 1) === 5
Central European (ISO Latin 2) === 9
Western (ISO Latin 3) === -2147483133
Central European (ISO Latin 4) === -2147483132
Cyrillic (ISO 8859-5) === -2147483131
Arabic (ISO 8859-6) === -2147483130
Greek (ISO 8859-7) === -2147483129
Hebrew (ISO 8859-8) === -2147483128
Turkish (ISO Latin 5) === -2147483127
Nordic (ISO Latin 6) === -2147483126
Thai (ISO 8859-11) === -2147483125
Baltic Rim (ISO Latin 7) === -2147483123
Celtic (ISO Latin 8) === -2147483122
Western (ISO Latin 9) === -2147483121
Romanian (ISO Latin 10) === -2147483120
Latin-US (DOS) === -2147482624
Greek (DOS) === -2147482619
Baltic Rim (DOS) === -2147482618
Western (DOS Latin 1) === -2147482608
Greek (DOS Greek 1) === -2147482607
Central European (DOS Latin 2) === -2147482606
Cyrillic (DOS) === -2147482605
Turkish (DOS) === -2147482604
Portuguese (DOS) === -2147482603
Icelandic (DOS) === -2147482602
Hebrew (DOS) === -2147482601
Canadian French (DOS) === -2147482600
Arabic (DOS) === -2147482599
Nordic (DOS) === -2147482598
Cyrillic (DOS) === -2147482597
Greek (DOS Greek 2) === -2147482596
Thai (Windows, DOS) === -2147482595
Japanese (Windows, DOS) === 8
Simplified Chinese (Windows, DOS) === -2147482591
Korean (Windows, DOS) === -2147482590
Traditional Chinese (Windows, DOS) === -2147482589
Western (Windows Latin 1) === 12
Central European (Windows Latin 2) === 15
Cyrillic (Windows) === 11
Greek (Windows) === 13
Turkish (Windows Latin 5) === 14
Hebrew (Windows) === -2147482363
Arabic (Windows) === -2147482362
Baltic Rim (Windows) === -2147482361
Vietnamese (Windows) === -2147482360
Western (ASCII) === 1
Japanese (Shift JIS X0213) === -2147482072
Chinese (GBK) === -2147482063
Chinese (GB 18030) === -2147482062
Japanese (ISO 2022-JP) === 21
Korean (ISO 2022-KR) === -2147481536
Japanese (EUC) === 3
Simplified Chinese (EUC) === -2147481296
Traditional Chinese (EUC) === -2147481295
Korean (EUC) === -2147481280
Japanese (Shift JIS) === -2147481087
Cyrillic (KOI8-R) === -2147481086
Traditional Chinese (Big 5) === -2147481085
Western (Mac Mail) === -2147481084
Simplified Chinese (HZ GB 2312) === -2147481083
Traditional Chinese (Big 5 HKSCS) === -2147481082
Ukrainian (KOI8-U) === -2147481080
Traditional Chinese (Big 5-E) === -2147481079
Western (NextStep) === 2
Non-lossy ASCII === 7
Western (EBCDIC Latin 1) === -2147480574
At last, the familiar GBK encoding, whose value is -2147482063. OK, let's change the original code:
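NSURL *url = [NSURL URLWithString:@"http://www.cmfu.com"];
NSError *error = nil;
// GBK is 0x80000631, which is the same bit pattern as -2147482063 read as a signed 32-bit value.
// Writing it in hex keeps the value correct when NSStringEncoding is 64 bits wide.
NSStringEncoding gbkEncoding = 0x80000631;
NSString *xml = [NSString stringWithContentsOfURL:url encoding:gbkEncoding error:&error];
if (xml == nil)
{
    NSLog(@"Error reading url at %@", [error localizedFailureReason]);
}
else
{
    [result setString:xml];
}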
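Where the odd-looking number comes from can be checked with a little arithmetic. The values in the list suggest that encodings without their own NS...StringEncoding constant are the CoreFoundation encoding number with the high bit set. The sketch below works on that assumption: the two constants are my copies of kCFStringEncodingGBK_95 and kCFStringEncodingGB_18030_2000 from CFStringEncodingExt.h, and both function names are hypothetical. In real code, CFStringConvertEncodingToNSStringEncoding() is the supported way to do the conversion.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed copies of the CoreFoundation constants for the two encodings. */
enum {
    CF_ENCODING_GBK_95        = 0x0631,
    CF_ENCODING_GB_18030_2000 = 0x0632
};

/* Assumption: an encoding with no dedicated NSString constant is the
   CoreFoundation value with bit 31 set. */
uint32_t ns_encoding_from_cf(uint32_t cf_encoding)
{
    assert((cf_encoding & 0x80000000u) == 0);
    return 0x80000000u | cf_encoding;
}

/* The same 32 bits read as a signed two's-complement number, which is what
   a %i format prints for the value. */
int32_t as_signed_32(uint32_t value)
{
    int64_t wide = (int64_t)value;
    if (value >= 0x80000000u)
        wide -= 4294967296LL;   /* subtract 2^32 */
    return (int32_t)wide;
}
```

Here as_signed_32(ns_encoding_from_cf(CF_ENCODING_GBK_95)) gives -2147482063, matching the Chinese (GBK) entry in the list, and the GB 18030 constant gives -2147482062.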
Finally got it working! Seeing familiar Chinese text on screen was genuinely exciting.
posted under Cocoa
One Comment to
“Reading Files in Any Encoding.”
On August 21st, 2007 at 10:05 am
Glider Says:
If you use the CoreFoundation framework, it seems to be less complicated: just call CFStringCreateWithBytes() and pass kCFStringEncodingGB_18030_2000 as the encoding.