CSV Format Encoder and Decoder
Making use of the comma-separated values (CSV) file format.
by Liu Junfeng @ 2008-1-15
The CSV (Comma Separated Values) file format is often used to exchange data between disparate applications.
CSV has much lower overhead, thereby using much less bandwidth and storage than XML.
Many informal documents describe the CSV format, but they differ in how special characters are handled.
Here I propose a solution that most people will agree with.
CSV data contains a list of records and a record contains a list of fields.
Records are not required to have the same number of fields.
Basic rules:
(1) Fields are separated with commas.
(2) Each record occupies just one line.
Extended rules:
(3) Padding spaces can be added ahead of a field.
(4) Fields may always be delimited with double quotes.
(5) The first record may be a record of column names.
Special rules:
(6) If a field value contains a leading or trailing space, or any of the special characters comma, double quote, or line break, it must be enclosed in double quotes.
(7) Within a double-quoted string, \\, \", \r, \n, and \t are treated as escape sequences.
(8) An empty string is written as a pair of double quotes (""); a null value is written as nothing at all.
Storage rules:
(9) The text is treated as Unicode and is loaded from and saved to a file using a specific encoding.
Usually UTF-8 or UTF-16 is used. The byte order mark is optional for UTF-8 and required for UTF-16.
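The storage rule can be sketched in Python as follows. This is only an illustration: the function names `decode_csv_bytes` and `encode_csv_text` are hypothetical, and falling back to UTF-8 when no byte order mark is found is an assumption of this sketch, not something the rules above prescribe.

```python
import codecs


def decode_csv_bytes(data: bytes, default_encoding: str = "utf-8") -> str:
    """Turn raw file bytes into text, honouring a byte order mark if present."""
    if data.startswith(codecs.BOM_UTF8):
        # The UTF-8 BOM is optional; strip it when it is there.
        return data[len(codecs.BOM_UTF8):].decode("utf-8")
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The "utf-16" codec reads the BOM, picks the byte order and drops it.
        return data.decode("utf-16")
    # No BOM: assume the caller-supplied default (an assumption of this sketch).
    return data.decode(default_encoding)


def encode_csv_text(text: str, encoding: str = "utf-8") -> bytes:
    """Turn text into file bytes.

    Python's "utf-16" codec always writes a BOM, which matches the rule that
    the BOM is required for UTF-16. The "utf-8" codec writes none.
    """
    return text.encode(encoding)
```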
Padding spaces can be used to align fields to the same column.
To keep to the basic rules, special characters must be handled by the special rules.
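As a rough illustration of rules (6) to (8), a field encoder might look like the sketch below. The names `encode_field` and `encode_record` are hypothetical, Python's `None` stands in for a null value, and treating a leading or trailing tab like a space is an assumption of this sketch.

```python
from typing import Iterable, Optional

# Escape sequences allowed inside a double-quoted field (rule 7).
_ESCAPES = {"\\": "\\\\", '"': '\\"', "\r": "\\r", "\n": "\\n", "\t": "\\t"}


def encode_field(value: Optional[str]) -> str:
    """Encode one field value as CSV text."""
    if value is None:
        return ""      # rule 8: a null value is written as nothing
    if value == "":
        return '""'    # rule 8: an empty string is double-quoted
    needs_quotes = (
        value[0] in " \t"
        or value[-1] in " \t"
        or any(ch in ',"\r\n' for ch in value)
    )
    if not needs_quotes:
        return value   # plain text is written as-is
    # Rule 6: quote the field. Rule 7: escape special characters inside it.
    return '"' + "".join(_ESCAPES.get(ch, ch) for ch in value) + '"'


def encode_record(fields: Iterable[Optional[str]]) -> str:
    """Encode one record on a single line, terminated by a line break."""
    return ",".join(encode_field(f) for f in fields) + "\r\n"
```

Because line breaks inside a value are written as the two characters \r or \n, every record still occupies exactly one line, which is what basic rule (2) requires.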
Grammar of CSV expressed in PEG:
CsvData <- Record* EndOfFile
Record <- Field (Separator Field)* EndOfLine
Field <- Spacing (QuotedText / UnQuotedText)
UnQuotedText <- (-",\"\r\n")*
QuotedText <- '"' (-"\"\r\n\\" / EscapeSequence)* '"'
EscapeSequence <- '\\\\' / '\\"' / '\\r' / '\\n' / '\\t'
Spacing <- Space*
Space <- ' ' / '\t'
Separator <- ','
EndOfLine <- '\r\n' / '\r' / '\n'
EndOfFile <- <end>
Here ",\"\r\n" denotes a character set, and -",\"\r\n" denotes its complement.
According to this grammar, only leading spaces are ignored, and each record must end with a line break.
Both restrictions simplify the grammar without harming the formatting style.
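A hand-written decoder that follows this grammar could be sketched as below. It is not the author's implementation. All names (`decode_csv`, `CsvSyntaxError`, and the helper functions) are hypothetical, and `None` represents a null field. The sketch tries the quoted form before the unquoted one, because the unquoted rule can match an empty string.

```python
from typing import List, Optional, Tuple

_UNESCAPE = {"\\": "\\", '"': '"', "r": "\r", "n": "\n", "t": "\t"}


class CsvSyntaxError(ValueError):
    """Raised when the input does not match the grammar."""


def decode_csv(text: str) -> List[List[Optional[str]]]:
    """CsvData <- Record* EndOfFile"""
    records, pos = [], 0
    while pos < len(text):
        record, pos = _parse_record(text, pos)
        records.append(record)
    return records


def _parse_record(text: str, pos: int) -> Tuple[List[Optional[str]], int]:
    """Record <- Field (Separator Field)* EndOfLine"""
    fields = []
    while True:
        value, pos = _parse_field(text, pos)
        fields.append(value)
        if not text.startswith(",", pos):
            break
        pos += 1  # consume the separator
    if text.startswith("\r\n", pos):
        return fields, pos + 2
    if pos < len(text) and text[pos] in "\r\n":
        return fields, pos + 1
    raise CsvSyntaxError(f"expected ',' or line break at offset {pos}")


def _parse_field(text: str, pos: int) -> Tuple[Optional[str], int]:
    """Field <- Spacing (QuotedText / UnQuotedText)"""
    while pos < len(text) and text[pos] in " \t":
        pos += 1  # leading padding is ignored
    if text.startswith('"', pos):
        return _parse_quoted(text, pos + 1)
    start = pos
    while pos < len(text) and text[pos] not in ',"\r\n':
        pos += 1
    # Empty unquoted text is a null value; "" is the empty string.
    return (text[start:pos] if pos > start else None), pos


def _parse_quoted(text: str, pos: int) -> Tuple[str, int]:
    """Body of QuotedText; pos points just past the opening quote."""
    out = []
    while pos < len(text):
        ch = text[pos]
        if ch == '"':
            return "".join(out), pos + 1
        if ch == "\\":
            code = text[pos + 1:pos + 2]
            if code not in _UNESCAPE:
                raise CsvSyntaxError(f"bad escape sequence at offset {pos}")
            out.append(_UNESCAPE[code])
            pos += 2
        elif ch in "\r\n":
            raise CsvSyntaxError(f"raw line break in quoted field at offset {pos}")
        else:
            out.append(ch)
            pos += 1
    raise CsvSyntaxError("unterminated quoted field")
```

For example, `decode_csv('x,,""\n')` yields one record whose fields are `"x"`, a null value, and an empty string.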
Other questions:
1. How to encode binary data?
CSV is not a suitable format for storing large blocks of binary data.
Small binary fields can be converted to text using Bin2Hex, Base64, etc.
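One way to do this in Python is shown below, using the standard `base64` module and `bytes.hex`. The helper names `binary_to_field` and `field_to_binary` are hypothetical.

```python
import base64


def binary_to_field(data: bytes, use_hex: bool = False) -> str:
    """Convert a small binary value to text that is safe in a CSV field."""
    if use_hex:
        return data.hex()
    return base64.b64encode(data).decode("ascii")


def field_to_binary(text: str, use_hex: bool = False) -> bytes:
    """Reverse binary_to_field for the same use_hex setting."""
    if use_hex:
        return bytes.fromhex(text)
    return base64.b64decode(text)
```

Neither hexadecimal digits nor the standard Base64 alphabet contains a comma, a double quote, or a line break, so the resulting text never needs quoting.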
2. How to encode multiple tables?
A blank line is read as a record with a single field whose value is null.
Normally a table has more than one column, so blank lines can be used to separate tables.
These features can be handled by the user of the CSV encoder/decoder.
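On the user side, splitting decoded records into tables could look like the following sketch. The function name `split_tables` is hypothetical, and the sketch assumes records are lists of fields in which `None` represents a null value, so a blank line appears as `[None]`.

```python
from typing import List, Optional

Record = List[Optional[str]]


def split_tables(records: List[Record]) -> List[List[Record]]:
    """Group records into tables, using blank-line records as separators."""
    tables: List[List[Record]] = []
    current: List[Record] = []
    for record in records:
        if record == [None]:
            # A blank line: close the current table, if it has any rows.
            if current:
                tables.append(current)
                current = []
        else:
            current.append(record)
    if current:
        tables.append(current)
    return tables
```

This only works when no table contains a genuine single-column row with a null value, which is why the approach relies on tables having more than one column.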