diff --git a/HRInfo.md b/HRInfo.md
new file mode 100644
index 0000000..6763dd9
--- /dev/null
+++ b/HRInfo.md
@@ -0,0 +1,469 @@
+# 2020 Spring Recruitment: Referral Postings from Major Companies (continuously updated, March 17)
+
+**Updated: 2020-04-09 21:37:03**
+
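+Note: the listings are sourced from Nowcoder; this is only a brief compilation, so please verify the details yourself.
+Entries are ordered roughly by posting date; some postings have application deadlines, so check them yourself.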
+
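+### Keeping this updated takes effort, so please like it to show support! Bookmark it to follow the latest postings.
+### Follow the WeChat official account TALKDATA for more interview tips!
+### Search for TALKDATA on Bilibili for many interview-experience videos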
+
+----------
+
+## Official Recruitment Sites of Major Internet Companies
+https://zhuanlan.zhihu.com/p/97099493
+
+## The listings below cover spring recruitment, supplementary hiring, and internships for the class of 2021
+
+### [Bilibili referral] Bilibili 2020 spring campus recruitment officially launched!  2020-4-9
+https://www.nowcoder.com/discuss/371425?type=7&order=0&pos=81&page=7
+
+### [Trend Micro] Class of 2021 summer early-batch "Offer+" program, referral code 9246  2020-4-9
+https://www.nowcoder.com/discuss/383040?type=7&order=0&pos=79&page=4
+
+### [Envision 2021 spring internship] Instant referral, plenty of headcount! Come if you need a push to land an offer  2020-4-9
+https://www.nowcoder.com/discuss/371672?type=7&order=0&pos=54&page=15
+
+### [Pinduoduo] 2020 summer internship referral, apply now!  2020-4-9
+https://www.nowcoder.com/discuss/393350?type=7&order=0&pos=48&page=11
+
+### [NetEase Interactive Entertainment] Class of 2021 internship referral, super simple! (opens April 13)  2020-4-9
+https://www.nowcoder.com/discuss/403853?type=7&order=0&pos=23&page=1
+
+### [TuSimple] 2020 spring recruitment [no written test / full-time / internship]  2020-4-9
+https://www.nowcoder.com/discuss/380005?type=7&order=0&pos=20&page=12
+
+### [CVTE] 2021 long-term intern recruitment brochure [intern recruitment]  2020-4-9
+https://www.nowcoder.com/discuss/368650?type=7&order=0&pos=17&page=3
+
+### [Kuaishou] Referrals for campus recruitment, internships, and spring supplementary hiring  2020-4-9
+https://www.nowcoder.com/discuss/398991?type=7&order=0&pos=13&page=2
+
+### [Alibaba] Taobao messaging middle platform - Java software engineer  2020-4-9
+https://www.nowcoder.com/discuss/403026?type=7&order=3&pos=187&page=0
+
+### [Shopee] 2020 Shopee internship program officially launched  2020-4-9
+https://www.nowcoder.com/discuss/403062?type=7&order=3&pos=178&page=0
+
+### [Alibaba Cloud] Java development engineer [conversion to full-time possible]  2020-4-9
+https://www.nowcoder.com/discuss/403622?type=7&order=3&pos=95&page=0
+
+### [Baidu] Direct department referral / company referral, various positions [experienced hires]  2020-4-9
+https://www.nowcoder.com/discuss/403663?type=7&order=3&pos=90&page=0
+
+### [Alibaba] Class of 2021 intern referrals for a core department that is easy to get into!  2020-4-9
+https://www.nowcoder.com/discuss/403666?type=7&order=3&pos=88&page=1
+
+### [Du Xiaoman] Du Xiaoman Financial 2020 summer intern recruitment referral  2020-4-9
+https://www.nowcoder.com/discuss/403683?type=7&order=3&pos=82&page=1
+
+### [Alibaba Cloud] High-availability architecture team internship hiring, 100% conversion, headcount 20+  2020-4-9
+https://www.nowcoder.com/discuss/403702?type=7&order=3&pos=80&page=1
+
+### [JD.com] 2020 campus and experienced hires can both be referred! Referral code inside  2020-4-9
+https://www.nowcoder.com/discuss/403847?type=7&order=3&pos=62&page=1
+
+### [Alibaba] Import business product technology department Java intern  2020-4-9
+https://www.nowcoder.com/discuss/404034?type=7&order=3&pos=38&page=0
+
+### [Baidu] Infrastructure department hiring backend R&D engineers, send your resume if interested!  2020-4-9
+https://www.nowcoder.com/discuss/404044?type=7&order=3&pos=35&page=0
+
+### [Alibaba Enterprise Intelligence] Last call for spring internships, high demand, conversion possible  2020-4-9
+https://www.nowcoder.com/discuss/404051?type=7&order=3&pos=30&page=0
+
+### [iFLYTEK] Spring recruitment referrals in full swing!  2020-3-17
+https://www.nowcoder.com/discuss/384336?type=0&order=undefined&pos=96&page=1
+
+### [Du Xiaoman] Summer internship referral  2020-3-17
+https://www.nowcoder.com/discuss/384339?type=0&order=undefined&pos=92&page=1
+
+### [Momo] Summer intern recruitment is now open  2020-3-17
+https://www.nowcoder.com/discuss/374285?type=0&order=undefined&pos=93&page=1
+ 
+### [China Merchants Bank Credit Card Center] Classes of 2021/2020: skip resume screening, straight to interview  2020-3-17
+https://www.nowcoder.com/discuss/384342?type=0&order=undefined&pos=84&page=0
+
+### [Qutoutiao] | Technical internship roundup | Not on board yet? We're only missing you  2020-3-17
+https://www.nowcoder.com/discuss/369159?type=0&order=undefined&pos=77&page=1
+ 
+### [Baidu] Java R&D engineer (spring internship, conversion possible)  2020-3-17
+https://www.nowcoder.com/discuss/375725?type=0&order=undefined&pos=63&page=1
+
+### [Alibaba Cloud] Visual Intelligence Open Platform  2020-3-17
+https://www.nowcoder.com/discuss/384348?type=7&order=3&pos=30&page=1
+
+### [Alibaba Cloud] High-availability architecture team summer intern recruitment  2020-3-17
+https://www.nowcoder.com/discuss/384377?type=7&order=3&pos=23&page=0
+
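+### [Ant Financial] [Class of 2021 internship] Resumes go straight to the manager; you can still apply even if you applied to other teams!  2020-3-17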
+https://www.nowcoder.com/discuss/384378?type=7&order=3&pos=22&page=1
+
+### [Pinduoduo] Class of 2020 campus recruitment, Pinyue Program - technical elite session  2020-3-17
+https://www.nowcoder.com/discuss/384387?type=7&order=3&pos=21&page=1
+
+### [SenseTime] [Direct HR referral] Internship positions galore  2020-3-7
+https://www.nowcoder.com/discuss/376615?type=7&order=3&pos=164&page=0
+
+### [Tencent WeChat Group (WXG) summer intern referral] Direct referral and progress tracking available  2020-3-7
+https://www.nowcoder.com/discuss/376641?type=7&order=3&pos=155&page=1
+
+### [Mogujie] Referrals open (class of 2021 internships; class of 2020 supplementary hiring, technical roles only)  2020-3-7
+https://www.nowcoder.com/discuss/376717?type=7&order=3&pos=136&page=0
+
+### [4399] Spring recruitment written tests start next week! Referral code v5okf, apply quickly  2020-3-7
+https://www.nowcoder.com/discuss/376755?type=7&order=3&pos=128&page=1
+
+### [Ant Financial] Data platform department hiring Java/algorithm engineers and more  2020-3-7
+https://www.nowcoder.com/discuss/150772
+
+### [Sangfor referral] Sangfor spring recruitment has started! Every request gets a response  2020-3-7
+https://www.nowcoder.com/discuss/376768?type=7&order=3&pos=124&page=1
+
+### [Meituan-Dianping] Spring recruitment has started; referral QR code in the post  2020-3-7
+https://www.nowcoder.com/discuss/376851?type=7&order=3&pos=97&page=0
+
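+### [NetEase] Spring referrals for class of 2021 internships and class of 2020 supplementary hiring, progress tracking available  2020-3-7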
+https://www.nowcoder.com/discuss/376934?type=7&order=3&pos=69&page=0
+
+### [Alipay] Financial core internship referral - senior colleague "No Coding on Weekends" is waiting to hear from you  2020-3-7
+https://www.nowcoder.com/discuss/376935?type=7&order=3&pos=68&page=1
+
+### [Alibaba] Taobao Live team spring recruitment (and interns), class of 2020 also welcome  2020-3-7
+https://www.nowcoder.com/discuss/376985?type=7&order=3&pos=48&page=1
+
+### [Alibaba] Data Technology and Products department 2021 internship and campus recruitment has started  2020-3-7
+https://www.nowcoder.com/discuss/377062?type=7&order=3&pos=17&page=1
+
+### [Yuanfudao spring referral] Campus offers worth 350k (35w) await you  2020-2-27
+https://www.nowcoder.com/discuss/370740?type=7&order=3&pos=479&page=2
+
+### [Bilibili] Class of 2020 spring/campus recruitment referral  2020-2-27
+https://www.nowcoder.com/discuss/370914?type=7&order=3&pos=406&page=7
+
+### [SmartX] Internships focused on infrastructure  2020-2-27
+https://www.nowcoder.com/discuss/371031?type=7&order=3&pos=358&page=1
+
+### [Douyu] Class of 2020 spring supplementary hiring and class of 2021 internship recruitment officially launched!  2020-2-27
+https://www.nowcoder.com/discuss/371494?type=7&order=3&pos=168&page=1
+
+### [Alibaba Tmall Global] Interns graduating in 2021, take an early look  2020-2-27
+https://www.nowcoder.com/discuss/371648?type=7&order=3&pos=108&page=0
+
+### [Envision] 2021 internship referral, instant referral! No time to explain, apply now!  2020-2-27
+https://www.nowcoder.com/discuss/371672?type=7&order=3&pos=95&page=1
+
+### [Ctrip referral] 2020 spring recruitment referral  2020-2-27
+https://www.nowcoder.com/discuss/371800?type=7&order=3&pos=63&page=1
+
+### [Meituan-Dianping] Basic R&D Platform summer interns, hired directly by the team (positions: test development, development  2020-2-27
+https://www.nowcoder.com/discuss/371810?type=7&order=3&pos=60&page=0
+
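+### [NetEase Interactive Entertainment] Class of 2020 spring supplementary hiring / class of 2021 summer internships are here  2020-2-27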
+https://www.nowcoder.com/discuss/371825?type=7&order=3&pos=57&page=1
+
+### [iFLYTEK] Spring recruitment referral  2020-2-27
+https://www.nowcoder.com/discuss/371832?type=7&order=3&pos=53&page=0
+
+### [Ant Financial OceanBase] Team internship recruitment for the class of 2021  2020-2-27
+https://www.nowcoder.com/discuss/371861?type=7&order=3&pos=47&page=1
+
+### [Ant Financial] International Business Group intern and campus recruitment  2020-2-27
+https://www.nowcoder.com/discuss/371906?type=7&order=3&pos=37&page=1
+
+### [SHEIN] Cross-border e-commerce unicorn, 2020 spring recruitment referral, progress checks offered  2020-2-27
+https://www.nowcoder.com/discuss/371969?type=7&order=3&pos=14&page=0
+
+### [Ant Financial] Class of 2021 internship - no written test - conversion possible - Java development  2020-2-27
+https://www.nowcoder.com/discuss/371979?type=7&order=3&pos=10&page=1
+
+### [Alibaba New Retail Supply Chain Platform business unit] Campus and experienced hires  2020-2-27
+https://www.nowcoder.com/discuss/371991?type=7&order=3&pos=7&page=1
+
+### [Ant Financial OceanBase team] Intern recruitment has started!  2020-2-23
+https://www.nowcoder.com/discuss/369212?type=7&order=3&pos=164&page=1
+
+### [Meituan-Dianping] Search backend intern  2020-2-23
+https://www.nowcoder.com/discuss/369264?type=7&order=3&pos=155&page=1
+
+### [Huawei 2012 Labs Central Software Institute] Spring recruitment and internships have started  2020-2-23
+https://www.nowcoder.com/discuss/369269?type=7&order=3&pos=153&page=1
+
+### [OnePlus] Various positions, referrals for the class of 2020  2020-2-23
+https://www.nowcoder.com/discuss/369280?type=7&order=3&pos=150&page=2 
+
+### [Inspur Group] Referral for spring supplementary hiring, Inspur Group class of 2020 campus recruitment  2020-2-23
+https://www.nowcoder.com/discuss/369553?type=7&order=3&pos=95&page=1
+
+### [TuSimple] Internship recruitment  2020-2-23
+https://www.nowcoder.com/discuss/369654?type=7&order=3&pos=69&page=0
+
+### [Microsoft Suzhou] O365 Tech Talk & experienced-hire referral  2020-2-23
+https://www.nowcoder.com/discuss/369868?type=7&order=3&pos=23&page=1
+
+### [Cainiao Logistics International] Spring internship and campus recruitment has started; if you haven't applied yet, hurry  2020-2-23
+https://www.nowcoder.com/discuss/369902?type=7&order=3&pos=20&page=0
+
+### [Alibaba DingTalk] Voyager Program - internal referrals for class of 2021 technical interns have started!  2020-2-23
+https://www.nowcoder.com/discuss/369920?type=7&order=3&pos=15&page=1
+
+### [NetEase Leihuo] Intern and spring supplementary hiring has launched; referral takes just a few steps  2020-2-23
+https://www.nowcoder.com/discuss/369936?type=7&order=3&pos=9&page=1
+
+### [Alibaba] Big data development intern  2020-02-16
+https://www.nowcoder.com/discuss/366504?type=7&order=3&pos=286&page=1
+
+### [AsiaInfo] Class of 2021 intern recruitment officially launched!  2020-02-16
+https://www.nowcoder.com/discuss/366630?type=7&order=3&pos=242&page=1
+ 
+### [Alibaba - Alimama] Spring internship referral  2020-02-16
+https://www.nowcoder.com/discuss/366714?type=7&order=3&pos=199&page=0
+
+### [Digital China Information Service] 2020 spring recruitment  2020-02-16
+https://www.nowcoder.com/discuss/366810?type=7&order=3&pos=162&page=1
+
+### [Alibaba DingTalk] Referrals for class of 2021 technical intern recruitment have started  2020-02-16
+https://www.nowcoder.com/discuss/366988?type=7&order=3&pos=103&page=1
+
+### [Cisco] 2020 campus supplementary hiring for software development engineers, based in Shanghai  2020-02-16
+https://www.nowcoder.com/discuss/367074?type=7&order=3&pos=88&page=1
+
+### [Huya] 2021 internships / 2020 campus recruitment / experienced hires  2020-02-16
+https://www.nowcoder.com/discuss/367086?type=7&order=3&pos=83&page=3
+
+### [SenseTime] Java development, class of 2020 supplementary hiring  2020-02-16
+https://www.nowcoder.com/discuss/367154?type=7&order=3&pos=60&page=1
+
+### [Yealink] Class of 2020 spring recruitment [open the post to get a referral]  2020-02-16
+https://www.nowcoder.com/discuss/366781?type=7&order=0&pos=42&page=2
+
+### [TP-Link] 2020 spring recruitment referral  2020-02-16
+https://www.nowcoder.com/discuss/366976?type=7&order=0&pos=29&page=1
+ 
+### [Sogou] 2020 spring recruitment, regular-batch referral  2020-02-16
+https://www.nowcoder.com/discuss/366789?type=7&order=0&pos=26&page=1
+
+### [Songguo Chuxing] Campus recruitment: good pay and benefits, applications welcome  2020-02-16
+https://www.nowcoder.com/discuss/367216?type=7&order=0&pos=18&page=1
+
+### [WeChat Official Accounts backend team] 2020 summer intern recruitment [conversion possible]  2020-02-16
+https://www.nowcoder.com/discuss/367353?type=7&order=0&pos=15&page=1
+
+### [FanRuan spring recruitment] Fully online written tests and interviews, based in Nanjing or Wuxi, referrals available  2020-02-16
+https://www.nowcoder.com/discuss/366846?type=7&order=0&pos=13&page=2
+
+### [XD Network & TapTap] 2020 spring campus recruitment has started!  2020-02-16
+https://www.nowcoder.com/discuss/367324?type=7&order=0&pos=12&page=2
+
+### [Shopee] (based in Singapore) Year-round referrals! Open to new and past graduates  2020-02-16
+https://www.nowcoder.com/discuss/367382?type=7&order=0&pos=10&page=0
+
+### [Yitu Technology] Referral  2020-02-16
+https://www.nowcoder.com/discuss/364637?type=7&order=0&pos=8&page=2
+
+### [WeChat] Mini Program technical team hiring summer interns  2020-02-12
+https://www.nowcoder.com/discuss/366099?type=7&order=3&pos=114&page=1
+
+### [Nanjing Qingshu] Spring recruitment early batch! Generous pay, non-technical roles too; you can interview from home  2020-02-12
+https://www.nowcoder.com/discuss/366310?type=7&order=3&pos=37&page=1
+
+### [iFLYTEK] Referrals for iFLYTEK class of 2020 campus recruitment spring supplementary hiring are here!  2020-02-12
+https://www.nowcoder.com/discuss/366335?type=7&order=3&pos=24&page=0
+ 
+### [Qi-Anxin] 2020 spring recruitment and internship referral  2020-02-12
+https://www.nowcoder.com/discuss/366347?type=7&order=3&pos=17&page=1
+
+### [Deeproute.ai] Autonomous-driving company: referral  2020-02-07
+https://www.nowcoder.com/discuss/364902?type=7&order=3&pos=86&page=1
+
+### [Beijing Momo] Internship recruitment: technical and product roles  2020-02-07
+https://www.nowcoder.com/discuss/364862?type=7&order=0&pos=151&page=1
+
+### [Alibaba Cloud Computing Co., Ltd.] Alibaba Cloud - Elastic Compute - Java R&D engineer  2020-02-07
+https://www.nowcoder.com/discuss/365054?type=7&order=0&pos=134&page=1
+
+### [Alimama] Taobao Alliance class of 2021 summer intern hiring has started  2020-02-07
+https://www.nowcoder.com/discuss/364820?type=7&order=0&pos=107&page=1
+
+### [Alibaba Cloud] Intelligent storage / high-performance computing 2020 intern recruitment!  2020-02-07
+https://www.nowcoder.com/discuss/365019?type=7&order=0&pos=93&page=2
+
+### [TCL] Class of 2020 campus recruitment launched, 400+ positions await you!  2020-02-07
+https://www.nowcoder.com/discuss/365063?type=7&order=0&pos=73&page=1
+
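+### [Taobao] Class of 2021 spring referral, New Retail Technology Business Group - Taobao Technology Department, technical and non-technical roles  2020-02-07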
+https://www.nowcoder.com/discuss/365108?type=7&order=0&pos=55&page=1
+
+### [37 Interactive Entertainment] Campus and experienced-hire positions  2020-02-07
+https://www.nowcoder.com/discuss/365171?type=7&order=0&pos=34&page=1
+
+### Backend development internship - Java - autonomous-driving infrastructure department, Shanghai  2020-02-07
+https://www.nowcoder.com/discuss/358568?type=7&order=0&pos=17&page=1
+
+### [Microsoft Suzhou] New product R&D team is hiring!  2020-02-07
+https://www.nowcoder.com/discuss/365059?type=7&order=0&pos=15&page=1
+
+### [Yuanfudao] Experienced hires or class of 2020, referrals for all kinds of engineers  2020-02-07
+https://www.nowcoder.com/discuss/365035?type=7&order=0&pos=13&page=1
+
+### [JD.com] JD Retail - technology and data middle platform intern recruitment  2020-02-07
+https://www.nowcoder.com/discuss/364913?type=0&order=0&pos=14&page=1
+
+### [Bosssoft] Spring recruitment online channel is open as usual; openings in frontend, Java, big data, algorithms, and software testing!  2020-02-04
+https://www.nowcoder.com/discuss/364394?type=7&order=0&pos=91&page=1
+
+### [Alibaba e-commerce: an emerging core business] 2021 campus recruitment launched  2020-02-04
+https://www.nowcoder.com/discuss/364592?type=0&order=0&pos=100&page=0
+
+### [Alibaba Supply Chain Platform business unit] Class of 2021 summer intern recruitment (spring recruitment)  2020-02-04
+https://www.nowcoder.com/discuss/364611?type=0&order=0&pos=48&page=0
+
+### [OPPO referral] Class of 2020 graduates: software, hardware, product, general functions, marketing, and more  2020-02-04
+https://www.nowcoder.com/discuss/364442?type=0&order=0&pos=9&page=3
+
+### [Grab referral - major international internet company] Experienced and campus hires both open! No overtime!  2020-02-04
+https://www.nowcoder.com/discuss/337566?type=7&order=0&pos=68&page=2
+
+### [ByteDance] Campus / experienced / internship hires, recruited directly by the data platform team
+https://www.nowcoder.com/discuss/363916?type=7&order=3&pos=30&page=1
+
+### [Alibaba] CBU Wireless early spring recruitment: Java/Android/iOS
+https://www.nowcoder.com/discuss/364007?type=7&order=3&pos=25&page=1
+
+### [Alibaba] Urgently hiring frontend engineers (internship, campus, and experienced hires all welcome)
+https://www.nowcoder.com/discuss/364086?type=7&order=3&pos=18&page=1
+
+### [TuSimple] Frontend, backend, and algorithm roles, internship referral
+https://www.nowcoder.com/discuss/364105?type=7&order=3&pos=14&page=1
+
+### [Jerry Ai] Software engineer hiring, including internships (remote/Toronto)
+https://www.nowcoder.com/discuss/364158?type=7&order=3&pos=10&page=0
+
+### [58.com] Direct department referral - campus and experienced hires - algorithms and backend
+https://www.nowcoder.com/discuss/362453?type=0&order=0&pos=140&page=1
+
+### [JD.com] Hiring interns, conversion possible - frontend/backend development engineers
+https://www.nowcoder.com/discuss/363081?type=0&order=0&pos=104&page=1
+
+### [NetEase Games] 2020 spring recruitment referral (Interactive Entertainment and Leihuo business groups), large headcount
+https://www.nowcoder.com/discuss/361032?type=0&order=0&pos=64&page=1
+
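+### [Toutiao internship referral] Self-service progress tracking, big data development intern (conversion possible) - commercial monetization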
+https://www.nowcoder.com/discuss/363813?type=0&order=0&pos=14&page=0
+
+### [Amazon] Software engineer (SDE II/SDE III)
+https://www.nowcoder.com/discuss/362803?type=0&order=0&pos=111&page=1
+
+### [Alibaba] Beijing: hiring interns and campus recruits graduating in 2020 or 2021
+https://www.nowcoder.com/discuss/362551?type=0&order=0&pos=92&page=1
+
+### [eBay Intelligent Marketing business unit] - Software development intern
+https://www.nowcoder.com/discuss/362874?type=0&order=0&pos=65&page=1
+
+### [Shopee] Spring recruitment is open
+https://www.nowcoder.com/discuss/362129?type=0&order=0&pos=32&page=1
+
+### [BIGO] Class of 2021 internship recruitment referral
+https://www.nowcoder.com/discuss/362915?type=0&order=0&pos=8&page=0
+
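+### [Baidu Recommendation Strategy department (one of Baidu's two core departments)] Experienced hires + 2020 graduates (campus supplementary hiring) + interns graduating in 2021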
+https://www.nowcoder.com/discuss/362298?type=0&order=0&pos=38&page=0
+
+### [Baidu] Campus supplementary hiring and spring internship referral
+https://www.nowcoder.com/discuss/362216?type=0&order=0&pos=43&page=1
+
+### [CVTE] Class of 2021 intern referral
+https://www.nowcoder.com/discuss/362006?type=0&order=0&pos=77&page=1
+
+### [Spring recruitment] Shopee R&D center 2020 spring campus recruitment is open! Apply quickly!
+https://www.nowcoder.com/discuss/362029?type=0&order=0&pos=43&page=1
+
+### [ByteDance]
+https://www.nowcoder.com/discuss/356297?type=0&order=0&pos=50&page=1
+https://www.nowcoder.com/discuss/361626?type=0&order=0&pos=51&page=1
+https://www.nowcoder.com/discuss/361964?type=0&order=0&pos=87&page=1
+
+### [Baidu] Hiring for client, frontend, backend, and data roles - internship, campus, and experienced hires
+https://www.nowcoder.com/discuss/361514?type=0&order=0&pos=109&page=1
+
+### [Maoyan referral] Class of 2020 supplementary hiring, positions still open in Shanghai
+https://www.nowcoder.com/discuss/318650?type=0&order=0&pos=37&page=6
+
+### [Weibo campus recruitment] [Hangzhou] Class of 2020 campus supplementary hiring, R&D roles
+https://www.nowcoder.com/discuss/359923?type=0&order=0&pos=23&page=1
+
+### [Xuelang Shuzhi] Hiring big data / Java developers (new graduates)
+https://www.nowcoder.com/discuss/345671?type=0&order=0&pos=16&page=1
+
+### [Suning] Class of 2020 spring recruitment early batch
+https://www.nowcoder.com/discuss/346355?type=0&order=0&pos=15&page=3
+
+### [Hikvision] Spring recruitment is here (class of 2020 campus, class of 2021 internship, experienced hires)
+https://www.nowcoder.com/discuss/361841?type=0&order=0&pos=14&page=1
+
+### [Guangzhou Suyou] Spring recruitment
+https://www.nowcoder.com/discuss/361662?type=0&order=0&pos=10&page=1
+
+### [VIVO] 2020 spring recruitment
+https://www.nowcoder.com/discuss/360785?type=0&order=0&pos=35&page=1
+
+### [ThoughtWorks supplementary hiring] Class of 2020
+https://www.nowcoder.com/discuss/358055?type=0&order=0&pos=17&page=1
+
+### [Yonyou] 2020 spring recruitment
+https://www.nowcoder.com/discuss/361223?type=0&order=0&pos=12&page=2
+
+### [Cambricon] Supplementary hiring
+https://www.nowcoder.com/discuss/361245?type=0&order=0&pos=51&page=1
+
+### [Sangfor] Class of 2021 interns
+https://www.nowcoder.com/discuss/360510?type=0&order=0&pos=81&page=1
+
+### [CMB Network Technology]
+https://www.nowcoder.com/discuss/361236?type=7&order=0&pos=33&page=1
+
+### [FanRuan] 2020 spring recruitment
+https://www.nowcoder.com/discuss/361185?type=7&order=0&pos=22&page=1
+
+### [Kuaishou] Class of 2020 supplementary hiring
+https://www.nowcoder.com/discuss/348236?type=7&order=0&pos=56&page=5
+
+### [Baidu autumn recruitment supplementary hiring] Smart Living Group - many positions available!
+https://www.nowcoder.com/discuss/360968?type=7&order=0&pos=68&page=1
+
+### [Alibaba - Taobao 2020 spring recruitment, class of 2020 graduates or class of 2021 interns] Hiring for all R&D roles
+https://www.nowcoder.com/discuss/361377?type=0&order=0&pos=6&page=1
+
+### [Taobao messaging platform] Spring recruitment - class of 2020 spring hires, class of 2021 internships
+https://www.nowcoder.com/discuss/361618?type=0&order=0&pos=25&page=1
+
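+### [Sohu] [Campus recruitment] Direct department hiring in progress - regular internships, intern-to-full-time conversion, spring recruitment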
+https://www.nowcoder.com/discuss/361361
+
+### [State-owned enterprise] [Everbright Technology] 2020 spring recruitment has started! Online interviews supported!
+https://www.nowcoder.com/discuss/358457?type=0&order=0&pos=55&page=2
+
+### [Baidu intelligent cloud computing department] Internship
+https://www.nowcoder.com/discuss/356380?type=7&order=0&pos=35&page=1
+
+### [Ping An Technology AI Center] Intern recruitment
+https://www.nowcoder.com/discuss/359518?type=7&order=0&pos=38&page=1
+
+### [Yuanfudao] Internship
+https://www.nowcoder.com/discuss/357041?type=7&order=0&pos=39&page=1
+
+### [2021 internship positions] Shumei Technology [data mining, R&D, data analysis]
+https://www.nowcoder.com/discuss/359938?type=7&order=0&pos=50&page=1
+
+### [SenseTime] Java development intern (Shanghai)
+https://www.nowcoder.com/discuss/361243?type=7&order=0&pos=11&page=1
+
+### [Huya class of 2021] Intern recruitment
+https://www.nowcoder.com/discuss/360960?type=7&order=0&pos=70&page=1
+
+### [IBM internship] Development engineer, data engineer
+https://www.nowcoder.com/discuss/360897?type=7&order=0&pos=98&page=1
+
+### [Internship (conversion possible)] [Direct team referral] DiDi big data analysis intern
+https://www.nowcoder.com/discuss/332399?type=7&order=0&pos=102&page=1
\ No newline at end of file
diff --git a/README.md b/README.md
index ad8810e..ffa8611 100644
--- a/README.md
+++ b/README.md
@@ -1,2 +1,128 @@
-# Big-Data-Project
-Hadoop2.x、Zookeeper、Flume、Hive、Hbase、Kafka、Spark2.x、SparkStreaming、MySQL、Hue、J2EE、websoket、Echarts
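+## TALKDATA (congratulations, you've found a hidden-gem blogger)
+TALKDATA is a programmer at Ant who switched into big data development from a non-CS background and a Bilibili creator focused on interview sharing; 500+ students have been helped into major tech companies!
+
+You can find me here:
+
+ - Bilibili: [TALKDATA][1]
+ - WeChat official account: [TALKDATA][2]
+ - Zhihu: [TALKDATA][3]
+ - QQ group: 316916234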
+
+### Sharing Timeline
+![image](image/TALKDATA_share.png)
+
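+### Learning Roadmap
+
+ - How I switched from a non-CS background to big data development: [video][4], [article][5]
+ - The full journey from career switch to joining Ant: [article][6]
+ - Learning roadmap for Java backend and big data development: [video][7], [download][8]
+ - Choosing a role: Java backend vs. big data development vs. algorithms: [video][9]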
+
+
+### Resume Design
+
+ - How to write a resume that stands out: [video][10], [article][11]
+ - Live resume revision for big data development, Java backend, and algorithm roles: [video][12]
+
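+### Presenting Projects
+
+ - How to present the projects on your resume: [video][13]
+ - Open source: big data real-time analytics and visualization system project: [link][14]
+ - Java backend project recommendations: designing project highlights and challenges: [video][15]
+ - Big data project recommendations: adding highlights and challenges to your real-time computing and data warehouse projects: [video][16]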
+
+### Internships at Major Tech Companies
+
+ - Planning an internship, converting to full-time, and tips for the conversion review: [video][17], [article][18]
+
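+### Preparing for Spring and Autumn Recruitment
+
+ - The best way to submit resumes for spring recruitment: [video][19], [article][20]
+ - Breaking down interview patterns at major internet companies: [video][21]
+ - Autumn recruitment interview experience: [video][22], [article][23]
+ - Interview tips for preparing for autumn recruitment on short notice: [video][24]
+ - Mock big-tech interview: watch an outstanding female programmer handle the interviewer: [video][25]
+ - The most carefully compiled company roundup online; there is an offer for everyone: [video][26]
+ - Comparing offers: all 400k+ (40w+), so hard to choose: [video][27]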
+ 
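+### Interview Experience
+
+ - How a non-CS graduate became a big-tech offer machine, landing SP-level offers from Alibaba Cloud, ByteDance, Baidu, and others in autumn recruitment: [video][28]
+ - Big-tech interview experience column (updated irregularly): [article][29]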
+
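+### Book Walkthroughs
+
+ - Interview reading list for Java big data development: [video][30], [download][31]
+ - [Core Java, Volume I][32]
+ - [Thinking in Java][33]
+ - [Understanding the Java Virtual Machine in Depth (深入理解Java虚拟机)][34]
+ - [Practical Java High-Concurrency Programming (实战Java高并发程序设计)][35]
+ - [The Art of Java Concurrency Programming (Java并发编程的艺术)][36]
+ - [Redis Design and Implementation (Redis设计与实现)][37]
+ - [Redis Deep Adventure: Core Principles and Application Practice (Redis深度历险)][38]
+ - [MySQL Technical Insider (MySQL技术内幕)][39]
+ - [Hadoop: The Definitive Guide][40]
+ - [Spark Big Data Processing Technology (Spark大数据处理技术)][41]
+ - [From Paxos to ZooKeeper: Distributed Consistency Principles and Practice (从PAXOS到Zookeeper分布式一致性原理与实践)][42]
+ - [Modern Operating Systems][43]
+ - [Computer Networking: A Top-Down Approach][44]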
+
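+### Reflections on Work
+
+ - [What working at a major tech company is really like, the development of the internet industry, and future plans][45]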
+
+### Knowledge Sharing
+ - [An introduction to big data in one article][46]
+ 
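+### Job-Hunting Discussion Group
+This QQ group is for job-hunting exchange, technical discussion, and sharing TALKDATA's latest interview-experience updates.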
+
+![image](image/qqqun.jpg)
+
+
+  [1]: https://space.bilibili.com/326797886
+  [2]: https://mp.weixin.qq.com/s/S4-3ptU4tzRJIJaWDALEwg
+  [3]: https://www.zhihu.com/people/changeforeda
+  [4]: https://www.bilibili.com/video/BV1SJ411T7DV
+  [5]: https://mp.weixin.qq.com/s?__biz=Mzg4NzAxMjIyOQ==&mid=2247483698&idx=1&sn=7a4a698356bba0547135e7203a6f2bab&chksm=cf91afd8f8e626cedc546403ac9bdfb838bf1328742c705c03d8970978c53e2cc8580d4e904d&scene=178&cur_album_id=1417724228044210177#rd
+  [6]: https://mp.weixin.qq.com/s/WBiPD_86XkMkepIN4J9euQ
+  [7]: https://www.bilibili.com/video/BV1HV411d7t6
+  [8]: https://mp.weixin.qq.com/s?__biz=Mzg4NzAxMjIyOQ==&mid=2247483818&idx=1&sn=8d6643e2bc280d07e259f451ce998207&chksm=cf91af40f8e626560eb164bd5e46984da653efa3bf6f38e6b70fa99cc181027052f776e4784c&scene=178&cur_album_id=1417724228044210177#rd
+  [9]: https://www.bilibili.com/video/BV1sJ41127EP
+  [10]: https://www.bilibili.com/video/BV1XE411P7s8
+  [11]: https://mp.weixin.qq.com/s?__biz=Mzg4NzAxMjIyOQ==&mid=2247483717&idx=1&sn=3ed1c1fcb27fc4c55abb563f0840b103&chksm=cf91afaff8e626b9c7863c3306284b4e6a29d7718f4691db628bb354c48c8907e55e8a3fa8a0&scene=178&cur_album_id=1417724228044210177#rd
+  [12]: https://www.bilibili.com/video/BV1nJ411h718
+  [13]: https://www.bilibili.com/video/BV1XE411P7s8
+  [14]: https://github.com/TALKDATA/JavaBigData/blob/master/news-project.md
+  [15]: https://www.bilibili.com/video/BV1VT4y1r71g
+  [16]: https://www.bilibili.com/video/BV193411K77p
+  [17]: https://www.bilibili.com/video/BV1hQ4y1d7MF
+  [18]: https://mp.weixin.qq.com/s?__biz=Mzg4NzAxMjIyOQ==&mid=2247483842&idx=1&sn=e51a6cd0d13650fbebe360f4ee63437b&chksm=cf91af28f8e6263e0ccc3fd2d623f8d44fe4d82bc5b412aaee54739412eff1c932801de757e5&scene=178&cur_album_id=1417724228044210177#rd
+  [19]: https://www.bilibili.com/video/BV1y7411K74s
+  [20]: https://mp.weixin.qq.com/s?__biz=Mzg4NzAxMjIyOQ==&mid=2247483755&idx=1&sn=01b0368084267dee649d6fa26954df4d&chksm=cf91af81f8e62697a03335c94357c9a562e31f3e837a887e9395a3b2e4c415a81423da120a52&scene=178&cur_album_id=1417724228044210177#rd
+  [21]: https://www.bilibili.com/video/BV1sK4y1T745
+  [22]: https://www.bilibili.com/video/BV12T4y177hj
+  [23]: https://mp.weixin.qq.com/s?__biz=Mzg4NzAxMjIyOQ==&mid=2247483923&idx=1&sn=844f925fb21a742aa73e40063166c862&chksm=cf91acf9f8e625ef2e20bfd35437d5e5d19777b57ff37f52eebec6b85ca1c2bb2a98a30a9a98&scene=178&cur_album_id=1417724228044210177#rd
+  [24]: https://www.bilibili.com/video/BV1Rb4y1k7Mc
+  [25]: https://www.bilibili.com/video/BV1Q7411U7ex
+  [26]: https://www.bilibili.com/video/BV1d34y1x7ts
+  [27]: https://www.bilibili.com/video/BV1Dh411t7Bq
+  [28]: https://www.bilibili.com/video/BV1p34y1x7wa
+  [29]: https://mp.weixin.qq.com/mp/appmsgalbum?action=getalbum&album_id=1417724228044210177&__biz=Mzg4NzAxMjIyOQ==&uin=&key=&devicetype=Windows%2010%20x64&version=63030073&lang=zh_CN&ascene=7&fontgear=2
+  [30]: https://www.bilibili.com/video/BV1UT4y137M9
+  [31]: https://mp.weixin.qq.com/s/Frefd9h1t_J8xyUihQfdig
+  [32]: https://www.bilibili.com/video/BV1Ma411w714
+  [33]: https://www.bilibili.com/video/BV1mE411B7PT
+  [34]: https://www.bilibili.com/video/BV1yE411R7co
+  [35]: https://www.bilibili.com/video/BV1ZE411S79m
+  [36]: https://www.bilibili.com/video/BV1AC4y1h78E
+  [37]: https://www.bilibili.com/video/BV1WE411f7fo
+  [38]: https://www.bilibili.com/video/BV1aE411o7Fk
+  [39]: https://www.bilibili.com/video/BV1CJ411t7Ku
+  [40]: https://www.bilibili.com/video/BV1DE411r7Fn
+  [41]: https://www.bilibili.com/video/BV1pJ411W7P5
+  [42]: https://www.bilibili.com/video/BV1EJ411L7AU
+  [43]: https://www.bilibili.com/video/BV1xJ411p7db
+  [44]: https://www.bilibili.com/video/BV1t7411q78v
+  [45]: https://www.bilibili.com/video/BV1DL4y1E7oo
+  [46]: https://mp.weixin.qq.com/s?__biz=Mzg4NzAxMjIyOQ==&mid=2247484509&idx=1&sn=427238d3f417911fa14f58d0092bc242&chksm=cf91aab7f8e623a1fa36bcb87b6fe90b41d20c924f52b2be59d2c09a145c96c45e8066e60613&token=1514526987&lang=zh_CN#rd
\ No newline at end of file
diff --git a/code/DataProducer/.idea/artifacts/DataProducer_jar.xml b/code/DataProducer/.idea/artifacts/DataProducer_jar.xml
new file mode 100644
index 0000000..f034a42
--- /dev/null
+++ b/code/DataProducer/.idea/artifacts/DataProducer_jar.xml
@@ -0,0 +1,8 @@
+<component name="ArtifactManager">
+  <artifact type="jar" name="DataProducer:jar">
+    <output-path>$PROJECT_DIR$/out/artifacts/DataProducer_jar</output-path>
+    <root id="archive" name="DataProducer.jar">
+      <element id="module-output" name="DataProducer" />
+    </root>
+  </artifact>
+</component>
\ No newline at end of file
diff --git a/code/DataProducer/.idea/artifacts/DataProducer_jar2.xml b/code/DataProducer/.idea/artifacts/DataProducer_jar2.xml
new file mode 100644
index 0000000..596a7be
--- /dev/null
+++ b/code/DataProducer/.idea/artifacts/DataProducer_jar2.xml
@@ -0,0 +1,8 @@
+<component name="ArtifactManager">
+  <artifact type="jar" name="DataProducer:jar2">
+    <output-path>$PROJECT_DIR$/out/artifacts/DataProducer_jar2</output-path>
+    <root id="archive" name="weblogs.jar">
+      <element id="module-output" name="DataProducer" />
+    </root>
+  </artifact>
+</component>
\ No newline at end of file
diff --git a/code/DataProducer/.idea/artifacts/DataProducer_jar3.xml b/code/DataProducer/.idea/artifacts/DataProducer_jar3.xml
new file mode 100644
index 0000000..c23ab96
--- /dev/null
+++ b/code/DataProducer/.idea/artifacts/DataProducer_jar3.xml
@@ -0,0 +1,8 @@
+<component name="ArtifactManager">
+  <artifact type="jar" name="DataProducer:jar3">
+    <output-path>$PROJECT_DIR$/out/artifacts/DataProducer_jar3</output-path>
+    <root id="archive" name="weblogs.jar">
+      <element id="module-output" name="DataProducer" />
+    </root>
+  </artifact>
+</component>
\ No newline at end of file
diff --git a/code/DataProducer/.idea/misc.xml b/code/DataProducer/.idea/misc.xml
new file mode 100644
index 0000000..0548357
--- /dev/null
+++ b/code/DataProducer/.idea/misc.xml
@@ -0,0 +1,6 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<project version="4">
+  <component name="ProjectRootManager" version="2" languageLevel="JDK_1_8" default="true" project-jdk-name="1.8" project-jdk-type="JavaSDK">
+    <output url="file://$PROJECT_DIR$/out" />
+  </component>
+</project>
\ No newline at end of file
diff --git a/code/DataProducer/.idea/modules.xml b/code/DataProducer/.idea/modules.xml
new file mode 100644
index 0000000..1b3af89
--- /dev/null
+++ b/code/DataProducer/.idea/modules.xml
@@ -0,0 +1,8 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<project version="4">
+  <component name="ProjectModuleManager">
+    <modules>
+      <module fileurl="file://$PROJECT_DIR$/DataProducer.iml" filepath="$PROJECT_DIR$/DataProducer.iml" />
+    </modules>
+  </component>
+</project>
\ No newline at end of file
diff --git a/code/DataProducer/.idea/workspace.xml b/code/DataProducer/.idea/workspace.xml
new file mode 100644
index 0000000..daa8adc
--- /dev/null
+++ b/code/DataProducer/.idea/workspace.xml
@@ -0,0 +1,276 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<project version="4">
+  <component name="ArtifactsWorkspaceSettings">
+    <artifacts-to-build>
+      <artifact name="DataProducer:jar3" />
+    </artifacts-to-build>
+  </component>
+  <component name="ChangeListManager">
+    <list default="true" id="22305db1-4db4-47f1-9fbe-b8348b6c7f63" name="Default Changelist" comment="" />
+    <ignored path="$PROJECT_DIR$/out/" />
+    <option name="EXCLUDED_CONVERTED_TO_IGNORED" value="true" />
+    <option name="SHOW_DIALOG" value="false" />
+    <option name="HIGHLIGHT_CONFLICTS" value="true" />
+    <option name="HIGHLIGHT_NON_ACTIVE_CHANGELIST" value="false" />
+    <option name="LAST_RESOLUTION" value="IGNORE" />
+  </component>
+  <component name="FUSProjectUsageTrigger">
+    <session id="2019644710">
+      <usages-collector id="statistics.lifecycle.project">
+        <counts>
+          <entry key="project.closed" value="3" />
+          <entry key="project.open.time.19" value="1" />
+          <entry key="project.open.time.2" value="2" />
+          <entry key="project.opened" value="3" />
+        </counts>
+      </usages-collector>
+      <usages-collector id="statistics.file.extensions.open">
+        <counts>
+          <entry key="java" value="1" />
+        </counts>
+      </usages-collector>
+      <usages-collector id="statistics.file.types.open">
+        <counts>
+          <entry key="JAVA" value="1" />
+        </counts>
+      </usages-collector>
+    </session>
+  </component>
+  <component name="FileEditorManager">
+    <leaf SIDE_TABS_SIZE_LIMIT_KEY="300">
+      <file pinned="false" current-in-tab="true">
+        <entry file="file://$PROJECT_DIR$/src/ReadWrite.java">
+          <provider selected="true" editor-type-id="text-editor">
+            <state relative-caret-position="672">
+              <caret line="48" column="5" lean-forward="true" selection-start-line="48" selection-start-column="5" selection-end-line="48" selection-end-column="5" />
+            </state>
+          </provider>
+        </entry>
+      </file>
+    </leaf>
+  </component>
+  <component name="FileTemplateManagerImpl">
+    <option name="RECENT_TEMPLATES">
+      <list>
+        <option value="Class" />
+      </list>
+    </option>
+  </component>
+  <component name="IdeDocumentHistory">
+    <option name="CHANGED_PATHS">
+      <list>
+        <option value="$PROJECT_DIR$/src/ReadWrite.java" />
+      </list>
+    </option>
+  </component>
+  <component name="JsBuildToolGruntFileManager" detection-done="true" sorting="DEFINITION_ORDER" />
+  <component name="JsBuildToolPackageJson" detection-done="true" sorting="DEFINITION_ORDER" />
+  <component name="JsGulpfileManager">
+    <detection-done>true</detection-done>
+    <sorting>DEFINITION_ORDER</sorting>
+  </component>
+  <component name="ProjectFrameBounds" extendedState="6">
+    <option name="x" value="493" />
+    <option name="y" value="121" />
+    <option name="width" value="1382" />
+    <option name="height" value="744" />
+  </component>
+  <component name="ProjectView">
+    <navigator proportions="" version="1">
+      <foldersAlwaysOnTop value="true" />
+    </navigator>
+    <panes>
+      <pane id="ProjectPane">
+        <subPane>
+          <expand>
+            <path>
+              <item name="DataProducer" type="b2602c69:ProjectViewProjectNode" />
+              <item name="DataProducer" type="462c0819:PsiDirectoryNode" />
+            </path>
+          </expand>
+          <select />
+        </subPane>
+      </pane>
+      <pane id="Scope" />
+      <pane id="PackagesPane" />
+    </panes>
+  </component>
+  <component name="PropertiesComponent">
+    <property name="WebServerToolWindowFactoryState" value="false" />
+    <property name="aspect.path.notification.shown" value="true" />
+    <property name="com.android.tools.idea.instantapp.provision.ProvisionBeforeRunTaskProvider.myTimeStamp" value="1547456455067" />
+    <property name="nodejs_interpreter_path.stuck_in_default_project" value="undefined stuck path" />
+    <property name="nodejs_npm_path_reset_for_default_project" value="true" />
+    <property name="project.structure.last.edited" value="Artifacts" />
+    <property name="project.structure.proportion" value="0.15" />
+    <property name="project.structure.side.proportion" value="0.2" />
+  </component>
+  <component name="RunDashboard">
+    <option name="ruleStates">
+      <list>
+        <RuleState>
+          <option name="name" value="ConfigurationTypeDashboardGroupingRule" />
+        </RuleState>
+        <RuleState>
+          <option name="name" value="StatusDashboardGroupingRule" />
+        </RuleState>
+      </list>
+    </option>
+  </component>
+  <component name="SvnConfiguration">
+    <configuration />
+  </component>
+  <component name="TaskManager">
+    <task active="true" id="Default" summary="Default task">
+      <changelist id="22305db1-4db4-47f1-9fbe-b8348b6c7f63" name="Default Changelist" comment="" />
+      <created>1547448855555</created>
+      <option name="number" value="Default" />
+      <option name="presentableId" value="Default" />
+      <updated>1547448855555</updated>
+      <workItem from="1547448857986" duration="1958000" />
+      <workItem from="1547542933000" duration="153000" />
+      <workItem from="1547970517092" duration="605000" />
+    </task>
+    <servers />
+  </component>
+  <component name="TimeTrackingManager">
+    <option name="totallyTimeSpent" value="2716000" />
+  </component>
+  <component name="ToolWindowManager">
+    <frame x="-8" y="-8" width="1936" height="1056" extended-state="6" />
+    <editor active="true" />
+    <layout>
+      <window_info active="true" content_ui="combo" id="Project" order="0" visible="true" weight="0.2553305" />
+      <window_info id="Structure" order="1" side_tool="true" weight="0.25" />
+      <window_info id="Image Layers" order="2" />
+      <window_info id="Designer" order="3" />
+      <window_info id="UI Designer" order="4" />
+      <window_info id="Capture Tool" order="5" />
+      <window_info id="Favorites" order="6" side_tool="true" />
+      <window_info anchor="bottom" id="Message" order="0" />
+      <window_info anchor="bottom" id="Find" order="1" />
+      <window_info anchor="bottom" id="Run" order="2" />
+      <window_info anchor="bottom" id="Debug" order="3" weight="0.4" />
+      <window_info anchor="bottom" id="Cvs" order="4" weight="0.25" />
+      <window_info anchor="bottom" id="Inspection" order="5" weight="0.4" />
+      <window_info anchor="bottom" id="TODO" order="6" />
+      <window_info anchor="bottom" id="Version Control" order="7" show_stripe_button="false" />
+      <window_info anchor="bottom" id="Database Changes" order="8" show_stripe_button="false" />
+      <window_info anchor="bottom" id="Terminal" order="9" />
+      <window_info anchor="bottom" id="Event Log" order="10" side_tool="true" />
+      <window_info anchor="bottom" id="Messages" order="11" />
+      <window_info anchor="right" id="Commander" internal_type="SLIDING" order="0" type="SLIDING" weight="0.4" />
+      <window_info anchor="right" id="Ant Build" order="1" weight="0.25" />
+      <window_info anchor="right" content_ui="combo" id="Hierarchy" order="2" weight="0.25" />
+      <window_info anchor="right" id="Palette" order="3" />
+      <window_info anchor="right" id="Capture Analysis" order="4" />
+      <window_info anchor="right" id="Database" order="5" />
+      <window_info anchor="right" id="Theme Preview" order="6" />
+      <window_info anchor="right" id="Palette&#9;" order="7" />
+      <window_info anchor="right" id="Maven Projects" order="8" />
+    </layout>
+  </component>
+  <component name="TypeScriptGeneratedFilesManager">
+    <option name="version" value="1" />
+  </component>
+  <component name="VcsContentAnnotationSettings">
+    <option name="myLimit" value="2678400000" />
+  </component>
+  <component name="editorHistoryManager">
+    <entry file="file://$PROJECT_DIR$/src/ReadWrite.java">
+      <provider selected="true" editor-type-id="text-editor">
+        <state relative-caret-position="672">
+          <caret line="48" column="5" lean-forward="true" selection-start-line="48" selection-start-column="5" selection-end-line="48" selection-end-column="5" />
+        </state>
+      </provider>
+    </entry>
+  </component>
+  <component name="masterDetails">
+    <states>
+      <state key="ArtifactsStructureConfigurable.UI">
+        <settings>
+          <artifact-editor />
+          <last-edited>DataProducer:jar3</last-edited>
+          <splitter-proportions>
+            <option name="proportions">
+              <list>
+                <option value="0.2" />
+                <option value="0.5" />
+              </list>
+            </option>
+          </splitter-proportions>
+        </settings>
+      </state>
+      <state key="FacetStructureConfigurable.UI">
+        <settings>
+          <last-edited>No facets are configured</last-edited>
+          <splitter-proportions>
+            <option name="proportions">
+              <list>
+                <option value="0.2" />
+              </list>
+            </option>
+          </splitter-proportions>
+        </settings>
+      </state>
+      <state key="GlobalLibrariesConfigurable.UI">
+        <settings>
+          <splitter-proportions>
+            <option name="proportions">
+              <list>
+                <option value="0.2" />
+              </list>
+            </option>
+          </splitter-proportions>
+        </settings>
+      </state>
+      <state key="JdkListConfigurable.UI">
+        <settings>
+          <last-edited>1.8</last-edited>
+          <splitter-proportions>
+            <option name="proportions">
+              <list>
+                <option value="0.2" />
+              </list>
+            </option>
+          </splitter-proportions>
+        </settings>
+      </state>
+      <state key="ModuleStructureConfigurable.UI">
+        <settings>
+          <last-edited>DataProducer</last-edited>
+          <splitter-proportions>
+            <option name="proportions">
+              <list>
+                <option value="0.2" />
+              </list>
+            </option>
+          </splitter-proportions>
+        </settings>
+      </state>
+      <state key="ProjectJDKs.UI">
+        <settings>
+          <last-edited>1.8</last-edited>
+          <splitter-proportions>
+            <option name="proportions">
+              <list>
+                <option value="0.2" />
+              </list>
+            </option>
+          </splitter-proportions>
+        </settings>
+      </state>
+      <state key="ProjectLibrariesConfigurable.UI">
+        <settings>
+          <splitter-proportions>
+            <option name="proportions">
+              <list>
+                <option value="0.2" />
+              </list>
+            </option>
+          </splitter-proportions>
+        </settings>
+      </state>
+    </states>
+  </component>
+</project>
\ No newline at end of file
diff --git a/code/DataProducer/DataProducer.iml b/code/DataProducer/DataProducer.iml
new file mode 100644
index 0000000..c90834f
--- /dev/null
+++ b/code/DataProducer/DataProducer.iml
@@ -0,0 +1,11 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<module type="JAVA_MODULE" version="4">
+  <component name="NewModuleRootManager" inherit-compiler-output="true">
+    <exclude-output />
+    <content url="file://$MODULE_DIR$">
+      <sourceFolder url="file://$MODULE_DIR$/src" isTestSource="false" />
+    </content>
+    <orderEntry type="inheritedJdk" />
+    <orderEntry type="sourceFolder" forTests="false" />
+  </component>
+</module>
\ No newline at end of file
diff --git a/code/DataProducer/out/artifacts/DataProducer_jar/DataProducer.jar b/code/DataProducer/out/artifacts/DataProducer_jar/DataProducer.jar
new file mode 100644
index 0000000..d13662a
Binary files /dev/null and b/code/DataProducer/out/artifacts/DataProducer_jar/DataProducer.jar differ
diff --git a/code/DataProducer/out/artifacts/DataProducer_jar2/weblogs.jar b/code/DataProducer/out/artifacts/DataProducer_jar2/weblogs.jar
new file mode 100644
index 0000000..d13662a
Binary files /dev/null and b/code/DataProducer/out/artifacts/DataProducer_jar2/weblogs.jar differ
diff --git a/code/DataProducer/out/artifacts/DataProducer_jar3/weblogs.jar b/code/DataProducer/out/artifacts/DataProducer_jar3/weblogs.jar
new file mode 100644
index 0000000..b947ad6
Binary files /dev/null and b/code/DataProducer/out/artifacts/DataProducer_jar3/weblogs.jar differ
diff --git a/code/DataProducer/out/production/DataProducer/META-INF/MANIFEST.MF b/code/DataProducer/out/production/DataProducer/META-INF/MANIFEST.MF
new file mode 100644
index 0000000..da86503
--- /dev/null
+++ b/code/DataProducer/out/production/DataProducer/META-INF/MANIFEST.MF
@@ -0,0 +1,3 @@
+Manifest-Version: 1.0
+Main-Class: ReadWrite
+
diff --git a/code/DataProducer/out/production/DataProducer/ReadWrite.class b/code/DataProducer/out/production/DataProducer/ReadWrite.class
new file mode 100644
index 0000000..09a3f4e
Binary files /dev/null and b/code/DataProducer/out/production/DataProducer/ReadWrite.class differ
diff --git a/code/DataProducer/src/META-INF/MANIFEST.MF b/code/DataProducer/src/META-INF/MANIFEST.MF
new file mode 100644
index 0000000..da86503
--- /dev/null
+++ b/code/DataProducer/src/META-INF/MANIFEST.MF
@@ -0,0 +1,3 @@
+Manifest-Version: 1.0
+Main-Class: ReadWrite
+
diff --git a/code/DataProducer/src/ReadWrite.java b/code/DataProducer/src/ReadWrite.java
new file mode 100644
index 0000000..da7754a
--- /dev/null
+++ b/code/DataProducer/src/ReadWrite.java
@@ -0,0 +1,67 @@
+import java.io.*;
+public class ReadWrite {
+    static String readFileName;
+    static String writeFileName;
+    public static void main(String[] args) {
+        readFileName = args[0];
+        writeFileName = args[1];
+        try {
+            readFileByLines(readFileName);
+        } catch (Exception e) {
+            e.printStackTrace();
+        }
+    }
+
+    public static void readFileByLines(String fileName) {
+        FileInputStream fis = null;
+        InputStreamReader isr = null;
+        BufferedReader br = null;
+        String tempString = null;
+        try {
+            System.out.println("Reading the file line by line, one whole line at a time:");
+            // FileInputStream reads raw bytes from a file in the file system
+            fis = new FileInputStream(fileName);
+            isr = new InputStreamReader(fis,"GBK");
+            br = new BufferedReader(isr);
+            int count=0;
+            while ((tempString = br.readLine()) != null) {
+                count++;
+                // Pause 300 ms between lines so the output is produced gradually,
+                // then print the line with its row number and append it to the output file
+                Thread.sleep(300);
+                System.out.println("row:"+count+">>>>>>>>"+tempString);
+                method1(writeFileName,tempString);
+                //appendMethodA(writeFileName,tempString);
+            }
+            isr.close();
+        } catch (IOException e) {
+            e.printStackTrace();
+        } catch (InterruptedException e) {
+            e.printStackTrace();
+        } finally {
+            if (isr != null) {
+                try {
+                    isr.close();
+                } catch (IOException e1) {
+                }
+            }
+        }
+    }
+    public static void method1(String file, String content) {
+        BufferedWriter out = null;
+        try {
+            out = new BufferedWriter(new OutputStreamWriter(
+                    new FileOutputStream(file, true))); // true = append to the file
+            out.write("\n");
+            out.write(content);
+        } catch (Exception e) {
+            e.printStackTrace();
+        } finally {
+            try {
+                if (out != null) out.close(); // out stays null if the file could not be opened
+            } catch (IOException e) {
+                e.printStackTrace();
+            }
+        }
+    }
+}
diff --git a/code/TestSpark/.idea/artifacts/TestSpark_jar.xml b/code/TestSpark/.idea/artifacts/TestSpark_jar.xml
new file mode 100644
index 0000000..d10e91d
--- /dev/null
+++ b/code/TestSpark/.idea/artifacts/TestSpark_jar.xml
@@ -0,0 +1,8 @@
+<component name="ArtifactManager">
+  <artifact type="jar" name="TestSpark:jar">
+    <output-path>$PROJECT_DIR$/out/artifacts/TestSpark_jar</output-path>
+    <root id="archive" name="TestSpark.jar">
+      <element id="module-output" name="TestSpark" />
+    </root>
+  </artifact>
+</component>
\ No newline at end of file
diff --git a/code/TestSpark/.idea/compiler.xml b/code/TestSpark/.idea/compiler.xml
new file mode 100644
index 0000000..ac75bec
--- /dev/null
+++ b/code/TestSpark/.idea/compiler.xml
@@ -0,0 +1,13 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<project version="4">
+  <component name="CompilerConfiguration">
+    <annotationProcessing>
+      <profile name="Maven default annotation processors profile" enabled="true">
+        <sourceOutputDir name="target/generated-sources/annotations" />
+        <sourceTestOutputDir name="target/generated-test-sources/test-annotations" />
+        <outputRelativeToContentRoot value="true" />
+        <module name="TestSpark" />
+      </profile>
+    </annotationProcessing>
+  </component>
+</project>
\ No newline at end of file
diff --git a/code/TestSpark/.idea/hydra.xml b/code/TestSpark/.idea/hydra.xml
new file mode 100644
index 0000000..c5d8b51
--- /dev/null
+++ b/code/TestSpark/.idea/hydra.xml
@@ -0,0 +1,9 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<project version="4">
+  <component name="HydraSettings">
+    <option name="hydraStorePath" value="D:\studyProcess\code\TestSpark\.hydra\idea" />
+    <option name="noOfCores" value="2" />
+    <option name="projectRoot" value="D:\studyProcess\code\TestSpark" />
+    <option name="sourcePartitioner" value="auto" />
+  </component>
+</project>
\ No newline at end of file
diff --git a/code/TestSpark/.idea/libraries/scala_sdk_2_11_12.xml b/code/TestSpark/.idea/libraries/scala_sdk_2_11_12.xml
new file mode 100644
index 0000000..1d1d2a8
--- /dev/null
+++ b/code/TestSpark/.idea/libraries/scala_sdk_2_11_12.xml
@@ -0,0 +1,25 @@
+<component name="libraryTable">
+  <library name="scala-sdk-2.11.12" type="Scala">
+    <properties>
+      <language-level>Scala_2_11</language-level>
+      <compiler-classpath>
+        <root url="file://D:/software/study/scala-2.11.12/lib/scala-compiler.jar" />
+        <root url="file://D:/software/study/scala-2.11.12/lib/scala-library.jar" />
+        <root url="file://D:/software/study/scala-2.11.12/lib/scala-reflect.jar" />
+      </compiler-classpath>
+    </properties>
+    <CLASSES>
+      <root url="jar://D:/software/study/scala-2.11.12/lib/scala-actors-2.11.0.jar!/" />
+      <root url="jar://D:/software/study/scala-2.11.12/lib/scala-actors-migration_2.11-1.1.0.jar!/" />
+      <root url="jar://D:/software/study/scala-2.11.12/lib/scala-library.jar!/" />
+      <root url="jar://D:/software/study/scala-2.11.12/lib/scala-parser-combinators_2.11-1.0.4.jar!/" />
+      <root url="jar://D:/software/study/scala-2.11.12/lib/scala-reflect.jar!/" />
+      <root url="jar://D:/software/study/scala-2.11.12/lib/scala-swing_2.11-1.0.2.jar!/" />
+      <root url="jar://D:/software/study/scala-2.11.12/lib/scala-xml_2.11-1.0.5.jar!/" />
+    </CLASSES>
+    <JAVADOC>
+      <root url="http://www.scala-lang.org/api/2.11.12/" />
+    </JAVADOC>
+    <SOURCES />
+  </library>
+</component>
\ No newline at end of file
diff --git a/code/TestSpark/.idea/misc.xml b/code/TestSpark/.idea/misc.xml
new file mode 100644
index 0000000..919019a
--- /dev/null
+++ b/code/TestSpark/.idea/misc.xml
@@ -0,0 +1,17 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<project version="4">
+  <component name="ExternalStorageConfigurationManager" enabled="true" />
+  <component name="FrameworkDetectionExcludesConfiguration">
+    <file type="web" url="file://$PROJECT_DIR$" />
+  </component>
+  <component name="MavenProjectsManager">
+    <option name="originalFiles">
+      <list>
+        <option value="$PROJECT_DIR$/pom.xml" />
+      </list>
+    </option>
+  </component>
+  <component name="ProjectRootManager" version="2" languageLevel="JDK_1_8" project-jdk-name="1.8" project-jdk-type="JavaSDK">
+    <output url="file://$PROJECT_DIR$/out" />
+  </component>
+</project>
\ No newline at end of file
diff --git a/code/TestSpark/.idea/workspace.xml b/code/TestSpark/.idea/workspace.xml
new file mode 100644
index 0000000..b0c22b4
--- /dev/null
+++ b/code/TestSpark/.idea/workspace.xml
@@ -0,0 +1,389 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<project version="4">
+  <component name="ArtifactsWorkspaceSettings">
+    <artifacts-to-build>
+      <artifact name="TestSpark:jar" />
+    </artifacts-to-build>
+  </component>
+  <component name="ChangeListManager">
+    <list default="true" id="2116a188-52e0-4217-9cd1-18304e954ee5" name="Default Changelist" comment="" />
+    <ignored path="$PROJECT_DIR$/out/" />
+    <ignored path="$PROJECT_DIR$/target/" />
+    <option name="EXCLUDED_CONVERTED_TO_IGNORED" value="true" />
+    <option name="SHOW_DIALOG" value="false" />
+    <option name="HIGHLIGHT_CONFLICTS" value="true" />
+    <option name="HIGHLIGHT_NON_ACTIVE_CHANGELIST" value="false" />
+    <option name="LAST_RESOLUTION" value="IGNORE" />
+  </component>
+  <component name="FUSProjectUsageTrigger">
+    <session id="1827498566">
+      <usages-collector id="statistics.lifecycle.project">
+        <counts>
+          <entry key="project.closed" value="4" />
+          <entry key="project.open.time.17" value="1" />
+          <entry key="project.open.time.29" value="1" />
+          <entry key="project.open.time.34" value="1" />
+          <entry key="project.open.time.38" value="1" />
+          <entry key="project.open.time.7" value="1" />
+          <entry key="project.opened" value="5" />
+        </counts>
+      </usages-collector>
+      <usages-collector id="statistics.file.extensions.open">
+        <counts>
+          <entry key="scala" value="3" />
+        </counts>
+      </usages-collector>
+      <usages-collector id="statistics.file.types.open">
+        <counts>
+          <entry key="Scala" value="3" />
+        </counts>
+      </usages-collector>
+      <usages-collector id="statistics.file.extensions.edit">
+        <counts>
+          <entry key="scala" value="350" />
+          <entry key="xml" value="53" />
+        </counts>
+      </usages-collector>
+      <usages-collector id="statistics.file.types.edit">
+        <counts>
+          <entry key="Scala" value="350" />
+          <entry key="XML" value="53" />
+        </counts>
+      </usages-collector>
+    </session>
+  </component>
+  <component name="FileEditorManager">
+    <leaf SIDE_TABS_SIZE_LIMIT_KEY="300">
+      <file pinned="false" current-in-tab="false">
+        <entry file="file://$PROJECT_DIR$/pom.xml">
+          <provider selected="true" editor-type-id="text-editor">
+            <state relative-caret-position="260">
+              <caret line="13" column="14" lean-forward="true" selection-start-line="13" selection-start-column="14" selection-end-line="13" selection-end-column="14" />
+            </state>
+          </provider>
+        </entry>
+      </file>
+      <file pinned="false" current-in-tab="true">
+        <entry file="file://$PROJECT_DIR$/src/main/scala/test.scala">
+          <provider selected="true" editor-type-id="text-editor">
+            <state>
+              <caret selection-end-line="19" selection-end-column="1" />
+            </state>
+          </provider>
+        </entry>
+      </file>
+      <file pinned="false" current-in-tab="false">
+        <entry file="file://$PROJECT_DIR$/src/main/scala/TestStreaming.scala">
+          <provider selected="true" editor-type-id="text-editor">
+            <state relative-caret-position="240">
+              <caret line="12" column="61" lean-forward="true" selection-start-line="12" selection-start-column="3" selection-end-line="12" selection-end-column="60" />
+              <folding>
+                <element signature="e#0#33#0" expanded="true" />
+              </folding>
+            </state>
+          </provider>
+        </entry>
+      </file>
+    </leaf>
+  </component>
+  <component name="IdeDocumentHistory">
+    <option name="CHANGED_PATHS">
+      <list>
+        <option value="$PROJECT_DIR$/src/main/scala/TestStreaming.scala" />
+        <option value="$PROJECT_DIR$/pom.xml" />
+        <option value="$PROJECT_DIR$/src/main/scala/test.scala" />
+      </list>
+    </option>
+  </component>
+  <component name="JsBuildToolGruntFileManager" detection-done="true" sorting="DEFINITION_ORDER" />
+  <component name="JsBuildToolPackageJson" detection-done="true" sorting="DEFINITION_ORDER" />
+  <component name="JsGulpfileManager">
+    <detection-done>true</detection-done>
+    <sorting>DEFINITION_ORDER</sorting>
+  </component>
+  <component name="MavenImportPreferences">
+    <option name="generalSettings">
+      <MavenGeneralSettings>
+        <option name="localRepository" value="D:\software\study\apache-maven-3.6.0\respository" />
+        <option name="mavenHome" value="$APPLICATION_HOME_DIR$/plugins/maven/lib/maven3" />
+        <option name="userSettingsFile" value="D:\software\study\apache-maven-3.6.0\conf\settings.xml" />
+      </MavenGeneralSettings>
+    </option>
+    <option name="importingSettings">
+      <MavenImportingSettings>
+        <option name="importAutomatically" value="true" />
+      </MavenImportingSettings>
+    </option>
+  </component>
+  <component name="ProjectFrameBounds" extendedState="7">
+    <option name="x" value="448" />
+    <option name="y" value="116" />
+    <option name="width" value="1382" />
+    <option name="height" value="744" />
+  </component>
+  <component name="ProjectView">
+    <navigator proportions="" version="1">
+      <foldersAlwaysOnTop value="true" />
+    </navigator>
+    <panes>
+      <pane id="PackagesPane" />
+      <pane id="Scope" />
+      <pane id="ProjectPane">
+        <subPane>
+          <expand>
+            <path>
+              <item name="TestSpark" type="b2602c69:ProjectViewProjectNode" />
+              <item name="TestSpark" type="462c0819:PsiDirectoryNode" />
+            </path>
+            <path>
+              <item name="TestSpark" type="b2602c69:ProjectViewProjectNode" />
+              <item name="TestSpark" type="462c0819:PsiDirectoryNode" />
+              <item name="src" type="462c0819:PsiDirectoryNode" />
+            </path>
+            <path>
+              <item name="TestSpark" type="b2602c69:ProjectViewProjectNode" />
+              <item name="TestSpark" type="462c0819:PsiDirectoryNode" />
+              <item name="src" type="462c0819:PsiDirectoryNode" />
+              <item name="main" type="462c0819:PsiDirectoryNode" />
+            </path>
+          </expand>
+          <select />
+        </subPane>
+      </pane>
+    </panes>
+  </component>
+  <component name="PropertiesComponent">
+    <property name="WebServerToolWindowFactoryState" value="false" />
+    <property name="aspect.path.notification.shown" value="true" />
+    <property name="com.android.tools.idea.instantapp.provision.ProvisionBeforeRunTaskProvider.myTimeStamp" value="1548832194941" />
+    <property name="last_opened_file_path" value="D:/software/study/scala-2.11.12" />
+    <property name="nodejs_interpreter_path.stuck_in_default_project" value="undefined stuck path" />
+    <property name="nodejs_npm_path_reset_for_default_project" value="true" />
+    <property name="project.structure.last.edited" value="Modules" />
+    <property name="project.structure.proportion" value="0.15" />
+    <property name="project.structure.side.proportion" value="0.2" />
+    <property name="settings.editor.selected.configurable" value="preferences.pluginManager" />
+  </component>
+  <component name="RecentsManager">
+    <key name="CopyClassDialog.RECENTS_KEY">
+      <recent name="" />
+    </key>
+  </component>
+  <component name="RunDashboard">
+    <option name="ruleStates">
+      <list>
+        <RuleState>
+          <option name="name" value="ConfigurationTypeDashboardGroupingRule" />
+        </RuleState>
+        <RuleState>
+          <option name="name" value="StatusDashboardGroupingRule" />
+        </RuleState>
+      </list>
+    </option>
+  </component>
+  <component name="RunManager" selected="Application.test">
+    <configuration name="TestStreaming" type="Application" factoryName="Application" temporary="true">
+      <option name="MAIN_CLASS_NAME" value="TestStreaming" />
+      <module name="TestSpark" />
+      <method v="2">
+        <option name="Make" enabled="true" />
+      </method>
+    </configuration>
+    <configuration name="test" type="Application" factoryName="Application" temporary="true">
+      <option name="MAIN_CLASS_NAME" value="test" />
+      <module name="TestSpark" />
+      <method v="2">
+        <option name="Make" enabled="true" />
+      </method>
+    </configuration>
+    <list>
+      <item itemvalue="Application.test" />
+      <item itemvalue="Application.TestStreaming" />
+    </list>
+    <recent_temporary>
+      <list>
+        <item itemvalue="Application.test" />
+        <item itemvalue="Application.TestStreaming" />
+      </list>
+    </recent_temporary>
+  </component>
+  <component name="SvnConfiguration">
+    <configuration />
+  </component>
+  <component name="TaskManager">
+    <task active="true" id="Default" summary="Default task">
+      <changelist id="2116a188-52e0-4217-9cd1-18304e954ee5" name="Default Changelist" comment="" />
+      <created>1548571318643</created>
+      <option name="number" value="Default" />
+      <option name="presentableId" value="Default" />
+      <updated>1548571318643</updated>
+      <workItem from="1548571323847" duration="569000" />
+      <workItem from="1548571942628" duration="8199000" />
+      <workItem from="1548685352833" duration="752000" />
+      <workItem from="1548768311980" duration="1449000" />
+      <workItem from="1548826307221" duration="1915000" />
+    </task>
+    <servers />
+  </component>
+  <component name="TimeTrackingManager">
+    <option name="totallyTimeSpent" value="12884000" />
+  </component>
+  <component name="ToolWindowManager">
+    <frame x="-8" y="-8" width="1936" height="1056" extended-state="7" />
+    <editor active="true" />
+    <layout>
+      <window_info content_ui="combo" id="Project" order="0" visible="true" weight="0.26066098" />
+      <window_info id="Structure" order="1" side_tool="true" weight="0.25" />
+      <window_info id="Designer" order="2" />
+      <window_info id="Image Layers" order="3" />
+      <window_info id="UI Designer" order="4" />
+      <window_info id="Favorites" order="5" side_tool="true" />
+      <window_info id="Capture Tool" order="6" />
+      <window_info id="Web" order="7" side_tool="true" />
+      <window_info anchor="bottom" id="Message" order="0" />
+      <window_info anchor="bottom" id="Find" order="1" />
+      <window_info active="true" anchor="bottom" id="Run" order="2" visible="true" weight="0.4099783" />
+      <window_info anchor="bottom" id="Debug" order="3" weight="0.4" />
+      <window_info anchor="bottom" id="Cvs" order="4" weight="0.25" />
+      <window_info anchor="bottom" id="Inspection" order="5" weight="0.4" />
+      <window_info anchor="bottom" id="TODO" order="6" />
+      <window_info anchor="bottom" id="Terminal" order="7" />
+      <window_info anchor="bottom" id="Event Log" order="8" side_tool="true" />
+      <window_info anchor="bottom" id="Messages" order="9" weight="0.2646421" />
+      <window_info anchor="bottom" id="Java Enterprise" order="10" />
+      <window_info anchor="bottom" id="Database Changes" order="11" show_stripe_button="false" />
+      <window_info anchor="bottom" id="Version Control" order="12" show_stripe_button="false" />
+      <window_info anchor="right" id="Commander" internal_type="SLIDING" order="0" type="SLIDING" weight="0.4" />
+      <window_info anchor="right" id="Ant Build" order="1" weight="0.25" />
+      <window_info anchor="right" content_ui="combo" id="Hierarchy" order="2" weight="0.25" />
+      <window_info anchor="right" id="Palette" order="3" />
+      <window_info anchor="right" id="Capture Analysis" order="4" />
+      <window_info anchor="right" id="Maven Projects" order="5" />
+      <window_info anchor="right" id="Database" order="6" />
+      <window_info anchor="right" id="Palette&#9;" order="7" />
+      <window_info anchor="right" id="Theme Preview" order="8" />
+      <window_info anchor="right" id="Bean Validation" order="9" />
+    </layout>
+  </component>
+  <component name="TypeScriptGeneratedFilesManager">
+    <option name="version" value="1" />
+  </component>
+  <component name="VcsContentAnnotationSettings">
+    <option name="myLimit" value="2678400000" />
+  </component>
+  <component name="editorHistoryManager">
+    <entry file="file://$PROJECT_DIR$/src/main/scala/TestStreaming.scala">
+      <provider selected="true" editor-type-id="text-editor">
+        <state relative-caret-position="240">
+          <caret line="12" column="61" lean-forward="true" selection-start-line="12" selection-start-column="3" selection-end-line="12" selection-end-column="60" />
+          <folding>
+            <element signature="e#0#33#0" expanded="true" />
+          </folding>
+        </state>
+      </provider>
+    </entry>
+    <entry file="file://$PROJECT_DIR$/pom.xml">
+      <provider selected="true" editor-type-id="text-editor">
+        <state relative-caret-position="260">
+          <caret line="13" column="14" lean-forward="true" selection-start-line="13" selection-start-column="14" selection-end-line="13" selection-end-column="14" />
+        </state>
+      </provider>
+    </entry>
+    <entry file="file://$PROJECT_DIR$/src/main/scala/test.scala">
+      <provider selected="true" editor-type-id="text-editor">
+        <state>
+          <caret selection-end-line="19" selection-end-column="1" />
+        </state>
+      </provider>
+    </entry>
+  </component>
+  <component name="masterDetails">
+    <states>
+      <state key="ArtifactsStructureConfigurable.UI">
+        <settings>
+          <artifact-editor />
+          <last-edited>TestSpark:jar</last-edited>
+          <splitter-proportions>
+            <option name="proportions">
+              <list>
+                <option value="0.2" />
+                <option value="0.5" />
+              </list>
+            </option>
+          </splitter-proportions>
+        </settings>
+      </state>
+      <state key="FacetStructureConfigurable.UI">
+        <settings>
+          <last-edited>Web (TestSpark)|Web</last-edited>
+          <splitter-proportions>
+            <option name="proportions">
+              <list>
+                <option value="0.2" />
+              </list>
+            </option>
+          </splitter-proportions>
+        </settings>
+      </state>
+      <state key="GlobalLibrariesConfigurable.UI">
+        <settings>
+          <splitter-proportions>
+            <option name="proportions">
+              <list>
+                <option value="0.2" />
+              </list>
+            </option>
+          </splitter-proportions>
+        </settings>
+      </state>
+      <state key="JdkListConfigurable.UI">
+        <settings>
+          <last-edited>1.8</last-edited>
+          <splitter-proportions>
+            <option name="proportions">
+              <list>
+                <option value="0.2" />
+              </list>
+            </option>
+          </splitter-proportions>
+        </settings>
+      </state>
+      <state key="ModuleStructureConfigurable.UI">
+        <settings>
+          <last-edited>TestSpark</last-edited>
+          <splitter-proportions>
+            <option name="proportions">
+              <list>
+                <option value="0.2" />
+              </list>
+            </option>
+          </splitter-proportions>
+        </settings>
+      </state>
+      <state key="ProjectJDKs.UI">
+        <settings>
+          <last-edited>1.8</last-edited>
+          <splitter-proportions>
+            <option name="proportions">
+              <list>
+                <option value="0.2" />
+              </list>
+            </option>
+          </splitter-proportions>
+        </settings>
+      </state>
+      <state key="ProjectLibrariesConfigurable.UI">
+        <settings>
+          <last-edited>scala-sdk-2.11.12</last-edited>
+          <splitter-proportions>
+            <option name="proportions">
+              <list>
+                <option value="0.2" />
+              </list>
+            </option>
+          </splitter-proportions>
+        </settings>
+      </state>
+    </states>
+  </component>
+</project>
\ No newline at end of file
diff --git a/code/TestSpark/TestSpark.iml b/code/TestSpark/TestSpark.iml
new file mode 100644
index 0000000..78b2cc5
--- /dev/null
+++ b/code/TestSpark/TestSpark.iml
@@ -0,0 +1,2 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<module type="JAVA_MODULE" version="4" />
\ No newline at end of file
diff --git a/code/TestSpark/out/artifacts/TestSpark_jar/TestSpark.jar b/code/TestSpark/out/artifacts/TestSpark_jar/TestSpark.jar
new file mode 100644
index 0000000..eb7e126
Binary files /dev/null and b/code/TestSpark/out/artifacts/TestSpark_jar/TestSpark.jar differ
diff --git a/code/TestSpark/pom.xml b/code/TestSpark/pom.xml
new file mode 100644
index 0000000..a4df494
--- /dev/null
+++ b/code/TestSpark/pom.xml
@@ -0,0 +1,53 @@
+<?xml version="1.0" encoding="UTF-8"?>
+
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
+
+  <modelVersion>4.0.0</modelVersion>
+  <packaging>jar</packaging>
+
+  <name>TestSpark</name>
+  <groupId>com.kfk.spark</groupId>
+  <artifactId>TestSpark</artifactId>
+  <version>1.0-SNAPSHOT</version>
+
+
+  <properties>
+    <scala.version>2.11.12</scala.version>
+    <scala.binary.version>2.11</scala.binary.version>
+    <spark.version>2.2.0</spark.version>
+  </properties>
+
+  <dependencies>
+    <dependency>
+      <groupId>org.apache.spark</groupId>
+      <artifactId>spark-core_${scala.binary.version}</artifactId>
+      <version>${spark.version}</version>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.spark</groupId>
+      <artifactId>spark-streaming_${scala.binary.version}</artifactId>
+      <version>${spark.version}</version>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.spark</groupId>
+      <artifactId>spark-sql_${scala.binary.version}</artifactId>
+      <version>${spark.version}</version>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.spark</groupId>
+      <artifactId>spark-hive_${scala.binary.version}</artifactId>
+      <version>${spark.version}</version>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.spark</groupId>
+      <artifactId>spark-streaming-kafka-0-10_${scala.binary.version}</artifactId>
+      <version>${spark.version}</version>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hadoop</groupId>
+      <artifactId>hadoop-client</artifactId>
+      <version>2.6.0</version>
+    </dependency>
+  </dependencies>
+
+</project>
diff --git a/code/TestSpark/src/main/resources/META-INF/MANIFEST.MF b/code/TestSpark/src/main/resources/META-INF/MANIFEST.MF
new file mode 100644
index 0000000..c4cd13b
--- /dev/null
+++ b/code/TestSpark/src/main/resources/META-INF/MANIFEST.MF
@@ -0,0 +1,3 @@
+Manifest-Version: 1.0
+Main-Class: test
+
diff --git a/code/TestSpark/src/main/scala/TestStreaming.scala b/code/TestSpark/src/main/scala/TestStreaming.scala
new file mode 100644
index 0000000..75aa35a
--- /dev/null
+++ b/code/TestSpark/src/main/scala/TestStreaming.scala
@@ -0,0 +1,23 @@
+import org.apache.spark.SparkConf
+import org.apache.spark.streaming.{Seconds, StreamingContext}
+
+
+object TestStreaming {
+
+  def main(args: Array[String]): Unit = {
+
+    val conf = new SparkConf().setMaster("local[2]").setAppName("NetworkWordCount")
+    val ssc = new StreamingContext(conf, Seconds(5))
+
+
+    val lines = ssc.socketTextStream("bigdata-pro01.kfk.com",9999)
+    val words = lines.flatMap(_.split(" "))
+    // map-reduce computation: pair each word with 1, then sum the counts per word
+    val wordCounts = words.map(x => (x, 1)).reduceByKey(_ + _)
+    wordCounts.print()
+    ssc.start()
+    ssc.awaitTermination()
+
+  }
+
+}
diff --git a/code/TestSpark/src/main/scala/test.scala b/code/TestSpark/src/main/scala/test.scala
new file mode 100644
index 0000000..04cc42d
--- /dev/null
+++ b/code/TestSpark/src/main/scala/test.scala
@@ -0,0 +1,20 @@
+import org.apache.spark.sql.SparkSession
+
+object test {
+  def main(args: Array[String]): Unit = {
+
+    val spark = SparkSession
+      .builder
+      .master("yarn-cluster")
+      // .master("local[2]")
+      .appName("HdfsTest")
+      .getOrCreate()
+
+    val path = args(0)
+    val out = args(1)
+
+    val rdd = spark.sparkContext.textFile(path)
+    rdd.flatMap(_.split(" ")).map(x => (x, 1)).reduceByKey(_ + _).saveAsTextFile(out)
+  }
+
+}
diff --git a/code/TestSpark/src/main/webapp/WEB-INF/applicationContext.xml b/code/TestSpark/src/main/webapp/WEB-INF/applicationContext.xml
new file mode 100644
index 0000000..9410604
--- /dev/null
+++ b/code/TestSpark/src/main/webapp/WEB-INF/applicationContext.xml
@@ -0,0 +1,43 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+<!-- @version $Id: applicationContext.xml 561608 2007-08-01 00:33:12Z vgritsenko $ -->
+<beans xmlns="http://www.springframework.org/schema/beans"
+       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+       xmlns:util="http://www.springframework.org/schema/util"
+       xmlns:configurator="http://cocoon.apache.org/schema/configurator"
+       xmlns:avalon="http://cocoon.apache.org/schema/avalon"
+       xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-2.0.xsd
+                           http://www.springframework.org/schema/util http://www.springframework.org/schema/util/spring-util-2.0.xsd
+                           http://cocoon.apache.org/schema/configurator http://cocoon.apache.org/schema/configurator/cocoon-configurator-1.0.1.xsd
+                           http://cocoon.apache.org/schema/avalon http://cocoon.apache.org/schema/avalon/cocoon-avalon-1.0.xsd">
+
+  <!-- Activate Cocoon Spring Configurator -->
+  <configurator:settings/>
+
+  <!-- Configure Log4j -->
+  <bean name="org.apache.cocoon.spring.configurator.log4j"
+        class="org.apache.cocoon.spring.configurator.log4j.Log4JConfigurator"
+        scope="singleton">
+    <property name="settings" ref="org.apache.cocoon.configuration.Settings"/>
+    <property name="resource" value="/WEB-INF/log4j.xml"/>
+  </bean>
+
+  <!-- Activate Avalon Bridge -->
+  <avalon:bridge/>
+
+</beans>
diff --git a/code/TestSpark/src/main/webapp/WEB-INF/log4j.xml b/code/TestSpark/src/main/webapp/WEB-INF/log4j.xml
new file mode 100644
index 0000000..edb3767
--- /dev/null
+++ b/code/TestSpark/src/main/webapp/WEB-INF/log4j.xml
@@ -0,0 +1,38 @@
+<?xml version="1.0" encoding="UTF-8" ?>
+<!--
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+<!DOCTYPE log4j:configuration SYSTEM "log4j.dtd">
+
+<log4j:configuration xmlns:log4j="http://jakarta.apache.org/log4j/">
+  <!--
+    - This is a sample configuration for log4j.
+    - It simply logs everything into a single log file.
+    - Note that you can use properties for value substitution.
+    -->
+  <appender name="CORE" class="org.apache.log4j.FileAppender">
+    <param name="File"   value="${org.apache.cocoon.work.directory}/cocoon-logs/log4j.log" />
+    <param name="Append" value="false" />
+    <layout class="org.apache.log4j.PatternLayout">
+      <param name="ConversionPattern" value="%d %-5p %t %c - %m%n"/>
+    </layout>
+  </appender>
+
+  <root>
+    <priority value="${org.apache.cocoon.log4j.loglevel}"/>
+    <appender-ref ref="CORE"/>
+  </root>
+</log4j:configuration>
diff --git a/code/TestSpark/src/main/webapp/WEB-INF/web.xml b/code/TestSpark/src/main/webapp/WEB-INF/web.xml
new file mode 100644
index 0000000..208b385
--- /dev/null
+++ b/code/TestSpark/src/main/webapp/WEB-INF/web.xml
@@ -0,0 +1,119 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+
+<!--
+  - This is the Cocoon web-app configuration file
+  -
+  - $Id$
+  -->
+<web-app version="2.4"
+         xmlns="http://java.sun.com/xml/ns/j2ee"
+         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+         xsi:schemaLocation="http://java.sun.com/xml/ns/j2ee http://java.sun.com/xml/ns/j2ee/web-app_2_4.xsd">
+
+  <!-- Servlet Filters ================================================ -->
+
+  <!--
+    - Declare a filter for multipart MIME handling
+    -->
+  <filter>
+    <description>Multipart MIME handling filter for Cocoon</description>
+    <display-name>Cocoon multipart filter</display-name>
+    <filter-name>CocoonMultipartFilter</filter-name>
+    <filter-class>org.apache.cocoon.servlet.multipart.MultipartFilter</filter-class>
+  </filter>
+
+  <!--
+    - Declare a filter for debugging incoming requests
+    -->
+  <filter>
+    <description>Log debug information about each request</description>
+    <display-name>Cocoon debug filter</display-name>
+    <filter-name>CocoonDebugFilter</filter-name>
+    <filter-class>org.apache.cocoon.servlet.DebugFilter</filter-class>
+  </filter>
+
+  <!-- Filter mappings ================================================ -->
+
+  <!--
+    - Use the Cocoon multipart filter together with the Cocoon demo webapp
+    -->
+  <filter-mapping>
+    <filter-name>CocoonMultipartFilter</filter-name>
+    <servlet-name>Cocoon</servlet-name>
+  </filter-mapping>
+  <filter-mapping>
+    <filter-name>CocoonMultipartFilter</filter-name>
+    <servlet-name>DispatcherServlet</servlet-name>
+  </filter-mapping>
+
+  <!--
+    - Use the Cocoon debug filter together with the Cocoon demo webapp
+  <filter-mapping>
+    <filter-name>CocoonDebugFilter</filter-name>
+    <servlet-name>Cocoon</servlet-name>
+  </filter-mapping>
+    -->
+
+  <!-- Servlet Context Listener ======================================= -->
+
+  <!--
+    - Declare Spring context listener which sets up the Spring Application Context
+    - containing all Cocoon components (and user defined beans as well).
+    -->
+  <listener>
+    <listener-class>org.springframework.web.context.ContextLoaderListener</listener-class>
+  </listener>
+
+  <!--
+    - Declare Spring request listener which sets up the required RequestAttributes
+    - to support Spring's and Cocoon's custom bean scopes like the request scope or
+    - the session scope.
+    -->
+  <listener>
+    <listener-class>org.springframework.web.context.request.RequestContextListener</listener-class>
+  </listener>
+
+  <!-- Servlet Configuration ========================================== -->
+
+  <!--
+    - Servlet that dispatches requests to the Spring managed block servlets
+    -->
+  <servlet>
+    <description>Cocoon blocks dispatcher</description>
+    <display-name>DispatcherServlet</display-name>
+    <servlet-name>DispatcherServlet</servlet-name>
+    <servlet-class>org.apache.cocoon.servletservice.DispatcherServlet</servlet-class>
+    <load-on-startup>1</load-on-startup>
+  </servlet>
+
+  <!-- URL space mappings ============================================= -->
+
+  <!--
+    - Cocoon handles all the URL space assigned to the webapp using its sitemap.
+    - It is recommended to leave it unchanged. Under some circumstances though
+    - (like integration with proprietary webapps or servlets) you might have
+    - to change this parameter.
+    -->
+  <servlet-mapping>
+    <servlet-name>DispatcherServlet</servlet-name>
+    <url-pattern>/*</url-pattern>
+  </servlet-mapping>
+
+</web-app>
+        
\ No newline at end of file
diff --git a/code/TestSpark/target/classes/META-INF/MANIFEST.MF b/code/TestSpark/target/classes/META-INF/MANIFEST.MF
new file mode 100644
index 0000000..c4cd13b
--- /dev/null
+++ b/code/TestSpark/target/classes/META-INF/MANIFEST.MF
@@ -0,0 +1,3 @@
+Manifest-Version: 1.0
+Main-Class: test
+
diff --git a/code/TestSpark/target/classes/TestStreaming$$anonfun$1.class b/code/TestSpark/target/classes/TestStreaming$$anonfun$1.class
new file mode 100644
index 0000000..5b52d7a
Binary files /dev/null and b/code/TestSpark/target/classes/TestStreaming$$anonfun$1.class differ
diff --git a/code/TestSpark/target/classes/TestStreaming$$anonfun$2.class b/code/TestSpark/target/classes/TestStreaming$$anonfun$2.class
new file mode 100644
index 0000000..46565dc
Binary files /dev/null and b/code/TestSpark/target/classes/TestStreaming$$anonfun$2.class differ
diff --git a/code/TestSpark/target/classes/TestStreaming$$anonfun$3.class b/code/TestSpark/target/classes/TestStreaming$$anonfun$3.class
new file mode 100644
index 0000000..4dc03d9
Binary files /dev/null and b/code/TestSpark/target/classes/TestStreaming$$anonfun$3.class differ
diff --git a/code/TestSpark/target/classes/TestStreaming$.class b/code/TestSpark/target/classes/TestStreaming$.class
new file mode 100644
index 0000000..e9bb3ba
Binary files /dev/null and b/code/TestSpark/target/classes/TestStreaming$.class differ
diff --git a/code/TestSpark/target/classes/TestStreaming.class b/code/TestSpark/target/classes/TestStreaming.class
new file mode 100644
index 0000000..5fee3cc
Binary files /dev/null and b/code/TestSpark/target/classes/TestStreaming.class differ
diff --git a/code/TestSpark/target/classes/test$$anonfun$1.class b/code/TestSpark/target/classes/test$$anonfun$1.class
new file mode 100644
index 0000000..05fd425
Binary files /dev/null and b/code/TestSpark/target/classes/test$$anonfun$1.class differ
diff --git a/code/TestSpark/target/classes/test$$anonfun$2.class b/code/TestSpark/target/classes/test$$anonfun$2.class
new file mode 100644
index 0000000..ca0bfac
Binary files /dev/null and b/code/TestSpark/target/classes/test$$anonfun$2.class differ
diff --git a/code/TestSpark/target/classes/test$$anonfun$3.class b/code/TestSpark/target/classes/test$$anonfun$3.class
new file mode 100644
index 0000000..8f75436
Binary files /dev/null and b/code/TestSpark/target/classes/test$$anonfun$3.class differ
diff --git a/code/TestSpark/target/classes/test$.class b/code/TestSpark/target/classes/test$.class
new file mode 100644
index 0000000..f9498db
Binary files /dev/null and b/code/TestSpark/target/classes/test$.class differ
diff --git a/code/TestSpark/target/classes/test.class b/code/TestSpark/target/classes/test.class
new file mode 100644
index 0000000..4d7c83c
Binary files /dev/null and b/code/TestSpark/target/classes/test.class differ
diff --git a/code/flume-ng-sinks/flume-dataset-sink/pom.xml b/code/flume-ng-sinks/flume-dataset-sink/pom.xml
new file mode 100644
index 0000000..1e8a07b
--- /dev/null
+++ b/code/flume-ng-sinks/flume-dataset-sink/pom.xml
@@ -0,0 +1,145 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to You under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+     http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+
+  <modelVersion>4.0.0</modelVersion>
+
+  <parent>
+    <artifactId>flume-ng-sinks</artifactId>
+    <groupId>org.apache.flume</groupId>
+    <version>1.7.0</version>
+  </parent>
+
+  <groupId>org.apache.flume.flume-ng-sinks</groupId>
+  <artifactId>flume-dataset-sink</artifactId>
+  <name>Flume NG Kite Dataset Sink</name>
+
+  <build>
+    <plugins>
+      <plugin>
+        <groupId>org.apache.rat</groupId>
+        <artifactId>apache-rat-plugin</artifactId>
+      </plugin>
+      <plugin>
+        <groupId>org.apache.felix</groupId>
+        <artifactId>maven-bundle-plugin</artifactId>
+        <version>2.3.7</version>
+        <inherited>true</inherited>
+        <extensions>true</extensions>
+      </plugin>
+    </plugins>
+  </build>
+
+  <dependencies>
+
+    <dependency>
+      <groupId>org.apache.flume</groupId>
+      <artifactId>flume-ng-sdk</artifactId>
+    </dependency>
+
+    <dependency>
+      <groupId>org.apache.flume</groupId>
+      <artifactId>flume-ng-configuration</artifactId>
+    </dependency>
+
+    <dependency>
+      <groupId>org.apache.flume</groupId>
+      <artifactId>flume-ng-core</artifactId>
+    </dependency>
+
+    <dependency>
+      <groupId>org.kitesdk</groupId>
+      <artifactId>kite-data-core</artifactId>
+    </dependency>
+
+    <dependency>
+      <groupId>org.kitesdk</groupId>
+      <artifactId>kite-data-hive</artifactId>
+    </dependency>
+
+    <dependency>
+      <groupId>org.kitesdk</groupId>
+      <artifactId>kite-data-hbase</artifactId>
+    </dependency>
+
+    <dependency>
+      <groupId>org.apache.avro</groupId>
+      <artifactId>avro</artifactId>
+    </dependency>
+
+    <dependency>
+      <groupId>org.apache.hive</groupId>
+      <artifactId>hive-exec</artifactId>
+      <optional>true</optional>
+    </dependency>
+
+    <dependency>
+      <groupId>org.apache.hive</groupId>
+      <artifactId>hive-metastore</artifactId>
+      <optional>true</optional>
+    </dependency>
+
+    <dependency>
+      <!-- build will fail if this is not hadoop-common 2.*
+      because kite uses hflush.
+      -->
+      <groupId>org.apache.hadoop</groupId>
+      <artifactId>hadoop-common</artifactId>
+      <version>${hadoop2.version}</version>
+      <optional>true</optional>
+    </dependency>
+
+    <dependency>
+      <groupId>org.slf4j</groupId>
+      <artifactId>slf4j-api</artifactId>
+    </dependency>
+
+    <dependency>
+      <groupId>com.google.guava</groupId>
+      <artifactId>guava</artifactId>
+    </dependency>
+
+    <dependency>
+      <groupId>junit</groupId>
+      <artifactId>junit</artifactId>
+      <scope>test</scope>
+    </dependency>
+
+    <dependency>
+      <groupId>org.apache.hadoop</groupId>
+      <artifactId>hadoop-minicluster</artifactId>
+      <version>${hadoop2.version}</version>
+      <scope>test</scope>
+    </dependency>
+
+    <dependency>
+      <groupId>org.slf4j</groupId>
+      <artifactId>slf4j-log4j12</artifactId>
+      <scope>test</scope>
+    </dependency>
+
+    <dependency>
+      <groupId>org.mockito</groupId>
+      <artifactId>mockito-all</artifactId>
+      <scope>test</scope>
+    </dependency>
+
+  </dependencies>
+
+</project>
diff --git a/code/flume-ng-sinks/flume-dataset-sink/src/main/java/org/apache/flume/sink/kite/DatasetSink.java b/code/flume-ng-sinks/flume-dataset-sink/src/main/java/org/apache/flume/sink/kite/DatasetSink.java
new file mode 100644
index 0000000..fa31262
--- /dev/null
+++ b/code/flume-ng-sinks/flume-dataset-sink/src/main/java/org/apache/flume/sink/kite/DatasetSink.java
@@ -0,0 +1,582 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flume.sink.kite;
+
+import org.apache.flume.auth.FlumeAuthenticationUtil;
+import org.apache.flume.auth.PrivilegedExecutor;
+import org.apache.flume.sink.kite.parser.EntityParserFactory;
+import org.apache.flume.sink.kite.parser.EntityParser;
+import org.apache.flume.sink.kite.policy.FailurePolicy;
+import org.apache.flume.sink.kite.policy.FailurePolicyFactory;
+import com.google.common.annotations.VisibleForTesting;
+import com.google.common.base.Preconditions;
+import com.google.common.base.Throwables;
+import com.google.common.collect.Lists;
+
+import java.net.URI;
+import java.security.PrivilegedAction;
+import java.util.List;
+import java.util.concurrent.TimeUnit;
+import org.apache.avro.Schema;
+import org.apache.avro.file.DataFileWriter;
+import org.apache.avro.generic.GenericRecord;
+import org.apache.flume.Channel;
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.flume.EventDeliveryException;
+import org.apache.flume.Transaction;
+import org.apache.flume.conf.Configurable;
+import org.apache.flume.instrumentation.SinkCounter;
+import org.apache.flume.sink.AbstractSink;
+import org.kitesdk.data.Dataset;
+import org.kitesdk.data.DatasetDescriptor;
+import org.kitesdk.data.DatasetIOException;
+import org.kitesdk.data.DatasetNotFoundException;
+import org.kitesdk.data.DatasetWriter;
+import org.kitesdk.data.Datasets;
+import org.kitesdk.data.Flushable;
+import org.kitesdk.data.Syncable;
+import org.kitesdk.data.View;
+import org.kitesdk.data.spi.Registration;
+import org.kitesdk.data.URIBuilder;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import static org.apache.flume.sink.kite.DatasetSinkConstants.*;
+import org.kitesdk.data.Format;
+import org.kitesdk.data.Formats;
+
+/**
+ * Sink that writes events to a Kite Dataset. This sink will parse the body of
+ * each incoming event and store the resulting entity in a Kite Dataset. It
+ * determines the destination Dataset by opening a dataset URI
+ * {@code kite.dataset.uri} or opening a repository URI, {@code kite.repo.uri},
+ * and loading a Dataset by name, {@code kite.dataset.name}, and namespace,
+ * {@code kite.dataset.namespace}.
+ */
+public class DatasetSink extends AbstractSink implements Configurable {
+
+  private static final Logger LOG = LoggerFactory.getLogger(DatasetSink.class);
+
+  private Context context = null;
+  private PrivilegedExecutor privilegedExecutor;
+
+  private String datasetName = null;
+  private URI datasetUri = null;
+  private Schema datasetSchema = null;
+  private DatasetWriter<GenericRecord> writer = null;
+
+  /**
+   * The number of events to process as a single batch.
+   */
+  private long batchSize = DEFAULT_BATCH_SIZE;
+
+  /**
+   * The number of seconds to wait before rolling a writer.
+   */
+  private int rollIntervalSeconds = DEFAULT_ROLL_INTERVAL;
+
+  /**
+   * Flag that says if Flume should commit on every batch.
+   */
+  private boolean commitOnBatch = DEFAULT_FLUSHABLE_COMMIT_ON_BATCH;
+
+  /**
+   * Flag that says if Flume should sync on every batch.
+   */
+  private boolean syncOnBatch = DEFAULT_SYNCABLE_SYNC_ON_BATCH;
+
+  /**
+   * The last time the writer rolled.
+   */
+  private long lastRolledMillis = 0L;
+
+  /**
+   * The raw number of bytes parsed.
+   */
+  private long bytesParsed = 0L;
+
+  /**
+   * A class for parsing Kite entities from Flume Events.
+   */
+  private EntityParser<GenericRecord> parser = null;
+
+  /**
+   * A class implementing a failure policy for events that had a
+   * non-recoverable error during processing.
+   */
+  private FailurePolicy failurePolicy = null;
+
+  private SinkCounter counter = null;
+
+  /**
+   * The Kite entity
+   */
+  private GenericRecord entity = null;
+  // TODO: remove this after PARQUET-62 is released
+  private boolean reuseEntity = true;
+
+  /**
+   * The Flume transaction. Used to keep transactions open across calls to
+   * process.
+   */
+  private Transaction transaction = null;
+
+  /**
+   * Internal flag indicating whether a batch of records has been committed.
+   * Used during rollback to know if the current writer needs to be closed.
+   */
+  private boolean committedBatch = false;
+
+  // Factories
+  private static final EntityParserFactory ENTITY_PARSER_FACTORY =
+      new EntityParserFactory();
+  private static final FailurePolicyFactory FAILURE_POLICY_FACTORY =
+      new FailurePolicyFactory();
+
+  /**
+   * Return the list of allowed formats.
+   * @return The list of allowed formats.
+   */
+  protected List<String> allowedFormats() {
+    return Lists.newArrayList("avro", "parquet");
+  }
+
+  @Override
+  public void configure(Context context) {
+    this.context = context;
+
+    String principal = context.getString(AUTH_PRINCIPAL);
+    String keytab = context.getString(AUTH_KEYTAB);
+    String effectiveUser = context.getString(AUTH_PROXY_USER);
+
+    this.privilegedExecutor = FlumeAuthenticationUtil.getAuthenticator(
+            principal, keytab).proxyAs(effectiveUser);
+
+    // Get the dataset URI and name from the context
+    String datasetURI = context.getString(CONFIG_KITE_DATASET_URI);
+    if (datasetURI != null) {
+      this.datasetUri = URI.create(datasetURI);
+      this.datasetName = uriToName(datasetUri);
+    } else {
+      String repositoryURI = context.getString(CONFIG_KITE_REPO_URI);
+      Preconditions.checkNotNull(repositoryURI, "No dataset configured. Setting "
+          + CONFIG_KITE_DATASET_URI + " is required.");
+
+      this.datasetName = context.getString(CONFIG_KITE_DATASET_NAME);
+      Preconditions.checkNotNull(datasetName, "No dataset configured. Setting "
+          + CONFIG_KITE_DATASET_URI + " is required.");
+
+      String namespace = context.getString(CONFIG_KITE_DATASET_NAMESPACE,
+          DEFAULT_NAMESPACE);
+
+      this.datasetUri = new URIBuilder(repositoryURI, namespace, datasetName)
+          .build();
+    }
+    this.setName(datasetUri.toString());
+
+    if (context.getBoolean(CONFIG_SYNCABLE_SYNC_ON_BATCH,
+        DEFAULT_SYNCABLE_SYNC_ON_BATCH)) {
+      Preconditions.checkArgument(
+          context.getBoolean(CONFIG_FLUSHABLE_COMMIT_ON_BATCH,
+              DEFAULT_FLUSHABLE_COMMIT_ON_BATCH), "Configuration error: "
+                  + CONFIG_FLUSHABLE_COMMIT_ON_BATCH + " must be set to true when "
+                  + CONFIG_SYNCABLE_SYNC_ON_BATCH + " is set to true.");
+    }
+
+    // Create the configured failure policy
+    this.failurePolicy = FAILURE_POLICY_FACTORY.newPolicy(context);
+
+    // other configuration
+    this.batchSize = context.getLong(CONFIG_KITE_BATCH_SIZE,
+        DEFAULT_BATCH_SIZE);
+    this.rollIntervalSeconds = context.getInteger(CONFIG_KITE_ROLL_INTERVAL,
+        DEFAULT_ROLL_INTERVAL);
+
+    this.counter = new SinkCounter(datasetName);
+  }
+
+  @Override
+  public synchronized void start() {
+    this.lastRolledMillis = System.currentTimeMillis();
+    counter.start();
+    // signal that this sink is ready to process
+    LOG.info("Started DatasetSink " + getName());
+    super.start();
+  }
+
+  /**
+   * Causes the sink to roll at the next {@link #process()} call.
+   */
+  @VisibleForTesting
+  void roll() {
+    this.lastRolledMillis = 0L;
+  }
+
+  @VisibleForTesting
+  DatasetWriter<GenericRecord> getWriter() {
+    return writer;
+  }
+
+  @VisibleForTesting
+  void setWriter(DatasetWriter<GenericRecord> writer) {
+    this.writer = writer;
+  }
+
+  @VisibleForTesting
+  void setParser(EntityParser<GenericRecord> parser) {
+    this.parser = parser;
+  }
+
+  @VisibleForTesting
+  void setFailurePolicy(FailurePolicy failurePolicy) {
+    this.failurePolicy = failurePolicy;
+  }
+
+  @Override
+  public synchronized void stop() {
+    counter.stop();
+
+    try {
+      // Close the writer and commit the transaction, but don't create a new
+      // writer since we're stopping
+      closeWriter();
+      commitTransaction();
+    } catch (EventDeliveryException ex) {
+      rollbackTransaction();
+
+      LOG.warn("Closing the writer failed: " + ex.getLocalizedMessage());
+      LOG.debug("Exception follows.", ex);
+      // We don't propagate the exception as the transaction would have been
+      // rolled back and we can still finish stopping
+    }
+
+    // signal that this sink has stopped
+    LOG.info("Stopped dataset sink: " + getName());
+    super.stop();
+  }
+
+  @Override
+  public Status process() throws EventDeliveryException {
+    long processedEvents = 0;
+
+    try {
+      if (shouldRoll()) {
+        closeWriter();
+        commitTransaction();
+        createWriter();
+      }
+
+      // The writer shouldn't be null at this point
+      Preconditions.checkNotNull(writer,
+          "Can't process events with a null writer. This is likely a bug.");
+      Channel channel = getChannel();
+
+      // Enter the transaction boundary if we haven't already
+      enterTransaction(channel);
+
+      for (; processedEvents < batchSize; processedEvents += 1) {
+        Event event = channel.take();
+
+        if (event == null) {
+          // no events available in the channel
+          break;
+        }
+
+        write(event);
+      }
+
+      // commit transaction
+      if (commitOnBatch) {
+        // Flush/sync before committing. A failure here will result in rolling back
+        // the transaction
+        if (syncOnBatch && writer instanceof Syncable) {
+          ((Syncable) writer).sync();
+        } else if (writer instanceof Flushable) {
+          ((Flushable) writer).flush();
+        }
+        boolean committed = commitTransaction();
+        Preconditions.checkState(committed,
+            "Tried to commit a batch when there was no transaction");
+        committedBatch |= committed;
+      }
+    } catch (Throwable th) {
+      // catch-all for any unhandled Throwable so that the transaction is
+      // correctly rolled back.
+      rollbackTransaction();
+
+      if (commitOnBatch && committedBatch) {
+        try {
+          closeWriter();
+        } catch (EventDeliveryException ex) {
+          LOG.warn("Error closing writer; there may be temp files that need to"
+              + " be manually recovered: " + ex.getLocalizedMessage());
+          LOG.debug("Exception follows.", ex);
+        }
+      } else {
+        this.writer = null;
+      }
+
+      // handle the exception
+      Throwables.propagateIfInstanceOf(th, Error.class);
+      Throwables.propagateIfInstanceOf(th, EventDeliveryException.class);
+      throw new EventDeliveryException(th);
+    }
+
+    if (processedEvents == 0) {
+      counter.incrementBatchEmptyCount();
+      return Status.BACKOFF;
+    } else if (processedEvents < batchSize) {
+      counter.incrementBatchUnderflowCount();
+    } else {
+      counter.incrementBatchCompleteCount();
+    }
+
+    counter.addToEventDrainSuccessCount(processedEvents);
+
+    return Status.READY;
+  }
+
+  /**
+   * Parse the event using the entity parser and write the entity to the dataset.
+   *
+   * @param event The event to write
+   * @throws EventDeliveryException An error occurred trying to write to the
+   *                                dataset that couldn't or shouldn't be
+   *                                handled by the failure policy.
+   */
+  @VisibleForTesting
+  void write(Event event) throws EventDeliveryException {
+    try {
+      this.entity = parser.parse(event, reuseEntity ? entity : null);
+      this.bytesParsed += event.getBody().length;
+
+      // writeEncoded would be an optimization in some cases, but HBase
+      // will not support it and partitioned Datasets need to get partition
+      // info from the entity Object. We may be able to avoid the
+      // serialization round-trip otherwise.
+      writer.write(entity);
+    } catch (NonRecoverableEventException ex) {
+      failurePolicy.handle(event, ex);
+    } catch (DataFileWriter.AppendWriteException ex) {
+      failurePolicy.handle(event, ex);
+    } catch (RuntimeException ex) {
+      Throwables.propagateIfInstanceOf(ex, EventDeliveryException.class);
+      throw new EventDeliveryException(ex);
+    }
+  }
+
+  /**
+   * Create a new writer.
+   *
+   * This method also re-loads the dataset so updates to the configuration or
+   * a dataset created after Flume starts will be loaded.
+   *
+   * @throws EventDeliveryException There was an error creating the writer.
+   */
+  @VisibleForTesting
+  void createWriter() throws EventDeliveryException {
+    // reset the committed flag whenever a new writer is created
+    committedBatch = false;
+    try {
+      View<GenericRecord> view;
+
+      view = privilegedExecutor.execute(
+        new PrivilegedAction<Dataset<GenericRecord>>() {
+          @Override
+          public Dataset<GenericRecord> run() {
+            return Datasets.load(datasetUri);
+          }
+        });
+
+      DatasetDescriptor descriptor = view.getDataset().getDescriptor();
+      Format format = descriptor.getFormat();
+      Preconditions.checkArgument(allowedFormats().contains(format.getName()),
+          "Unsupported format: " + format.getName());
+
+      Schema newSchema = descriptor.getSchema();
+      if (datasetSchema == null || !newSchema.equals(datasetSchema)) {
+        this.datasetSchema = descriptor.getSchema();
+        // dataset schema has changed, create a new parser
+        parser = ENTITY_PARSER_FACTORY.newParser(datasetSchema, context);
+      }
+
+      this.reuseEntity = !(Formats.PARQUET.equals(format));
+
+      // TODO: Check that the format implements Flushable after CDK-863
+      // goes in. For now, just check that the Dataset is Avro format
+      this.commitOnBatch = context.getBoolean(CONFIG_FLUSHABLE_COMMIT_ON_BATCH,
+          DEFAULT_FLUSHABLE_COMMIT_ON_BATCH) && (Formats.AVRO.equals(format));
+
+      // TODO: Check that the format implements Syncable after CDK-863
+      // goes in. For now, just check that the Dataset is Avro format
+      this.syncOnBatch = context.getBoolean(CONFIG_SYNCABLE_SYNC_ON_BATCH,
+          DEFAULT_SYNCABLE_SYNC_ON_BATCH) && (Formats.AVRO.equals(format));
+
+      this.datasetName = view.getDataset().getName();
+
+      this.writer = view.newWriter();
+
+      // Reset the last rolled time and the metrics
+      this.lastRolledMillis = System.currentTimeMillis();
+      this.bytesParsed = 0L;
+    } catch (DatasetNotFoundException ex) {
+      throw new EventDeliveryException("Dataset " + datasetUri + " not found."
+          + " The dataset must be created before Flume can write to it.", ex);
+    } catch (RuntimeException ex) {
+      throw new EventDeliveryException("Error trying to open a new"
+          + " writer for dataset " + datasetUri, ex);
+    }
+  }
+
+  /**
+   * Return true if the sink should roll the writer.
+   *
+   * Currently, this is based on time since the last roll or if the current
+   * writer is null.
+   *
+   * @return True if and only if the sink should roll the writer
+   */
+  private boolean shouldRoll() {
+    long currentTimeMillis = System.currentTimeMillis();
+    long elapsedTimeSeconds = TimeUnit.MILLISECONDS.toSeconds(
+        currentTimeMillis - lastRolledMillis);
+
+    LOG.debug("Current time: {}, lastRolled: {}, diff: {} sec",
+        new Object[] {currentTimeMillis, lastRolledMillis, elapsedTimeSeconds});
+
+    return elapsedTimeSeconds >= rollIntervalSeconds || writer == null;
+  }
+
+  /**
+   * Close the current writer.
+   *
+   * This method always sets the current writer to null even if close fails.
+   * If this method throws an Exception, callers *must* roll back any active
+   * transaction to ensure that data is replayed.
+   *
+   * @throws EventDeliveryException There was an error closing the writer.
+   */
+  @VisibleForTesting
+  void closeWriter() throws EventDeliveryException {
+    if (writer != null) {
+      try {
+        writer.close();
+
+        long elapsedTimeSeconds = TimeUnit.MILLISECONDS.toSeconds(
+            System.currentTimeMillis() - lastRolledMillis);
+        LOG.info("Closed writer for {} after {} seconds and {} bytes parsed",
+            new Object[]{datasetUri, elapsedTimeSeconds, bytesParsed});
+      } catch (DatasetIOException ex) {
+        throw new EventDeliveryException("Check HDFS permissions/health. IO"
+            + " error trying to close the writer for dataset " + datasetUri,
+            ex);
+      } catch (RuntimeException ex) {
+        throw new EventDeliveryException("Error trying to close the writer for"
+            + " dataset " + datasetUri, ex);
+      } finally {
+        // If we failed to close the writer then we give up on it as we'll
+        // end up throwing an EventDeliveryException which will result in
+        // a transaction rollback and a replay of any events written during
+        // the current transaction. If commitOnBatch is true, you can still
+        // end up with orphaned temp files that have data to be recovered.
+        this.writer = null;
+        failurePolicy.close();
+      }
+    }
+  }
+
+  /**
+   * Enter the transaction boundary. This begins a new transaction if one
+   * doesn't already exist. If we're already in a transaction boundary, then
+   * this method does nothing.
+   *
+   * @param channel The Sink's channel
+   * @throws EventDeliveryException There was an error starting a new batch
+   *                                with the failure policy.
+   */
+  private void enterTransaction(Channel channel) throws EventDeliveryException {
+    // There's no synchronization around the transaction instance because the
+    // Sink API states "the Sink#process() call is guaranteed to only
+    // be accessed by a single thread". Technically other methods could be
+    // called concurrently, but the implementation of SinkRunner waits
+    // for the Thread running process() to end before calling stop()
+    if (transaction == null) {
+      this.transaction = channel.getTransaction();
+      transaction.begin();
+      failurePolicy = FAILURE_POLICY_FACTORY.newPolicy(context);
+    }
+  }
+
+  /**
+   * Commit and close the transaction.
+   *
+   * If this method throws an Exception the caller *must* ensure that the
+   * transaction is rolled back. Callers can roll back the transaction by
+   * calling {@link #rollbackTransaction()}.
+   *
+   * @return True if there was an open transaction and it was committed, false
+   *         otherwise.
+   * @throws EventDeliveryException There was an error ending the batch with
+   *                                the failure policy.
+   */
+  @VisibleForTesting
+  boolean commitTransaction() throws EventDeliveryException {
+    if (transaction != null) {
+      failurePolicy.sync();
+      transaction.commit();
+      transaction.close();
+      this.transaction = null;
+      return true;
+    } else {
+      return false;
+    }
+  }
+
+  /**
+   * Roll back the transaction. If there is a RuntimeException during rollback,
+   * it will be logged but the transaction instance variable will still be
+   * nullified.
+   */
+  private void rollbackTransaction() {
+    if (transaction != null) {
+      try {
+        // If the transaction wasn't committed before we got the exception, we
+        // need to rollback.
+        transaction.rollback();
+      } catch (RuntimeException ex) {
+        LOG.error("Transaction rollback failed: " + ex.getLocalizedMessage());
+        LOG.debug("Exception follows.", ex);
+      } finally {
+        transaction.close();
+        this.transaction = null;
+      }
+    }
+  }
+
+  /**
+   * Get the name of the dataset from the URI
+   *
+   * @param uri The dataset or view URI
+   * @return The dataset name
+   */
+  private static String uriToName(URI uri) {
+    return Registration.lookupDatasetUri(URI.create(
+        uri.getRawSchemeSpecificPart())).second().get("dataset");
+  }
+}
diff --git a/code/flume-ng-sinks/flume-dataset-sink/src/main/java/org/apache/flume/sink/kite/DatasetSinkConstants.java b/code/flume-ng-sinks/flume-dataset-sink/src/main/java/org/apache/flume/sink/kite/DatasetSinkConstants.java
new file mode 100644
index 0000000..af33304
--- /dev/null
+++ b/code/flume-ng-sinks/flume-dataset-sink/src/main/java/org/apache/flume/sink/kite/DatasetSinkConstants.java
@@ -0,0 +1,132 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flume.sink.kite;
+
+import org.kitesdk.data.URIBuilder;
+
+public class DatasetSinkConstants {
+  /**
+   * URI of the Kite Dataset
+   */
+  public static final String CONFIG_KITE_DATASET_URI = "kite.dataset.uri";
+
+  /**
+   * URI of the Kite DatasetRepository.
+   */
+  public static final String CONFIG_KITE_REPO_URI = "kite.repo.uri";
+
+  /**
+   * Name of the Kite Dataset to write into.
+   */
+  public static final String CONFIG_KITE_DATASET_NAME = "kite.dataset.name";
+
+  /**
+   * Namespace of the Kite Dataset to write into.
+   */
+  public static final String CONFIG_KITE_DATASET_NAMESPACE =
+      "kite.dataset.namespace";
+  public static final String DEFAULT_NAMESPACE = URIBuilder.NAMESPACE_DEFAULT;
+
+  /**
+   * Number of records to process from the incoming channel per call to process.
+   */
+  public static final String CONFIG_KITE_BATCH_SIZE = "kite.batchSize";
+  public static long DEFAULT_BATCH_SIZE = 100;
+
+  /**
+   * Maximum time to wait before finishing files.
+   */
+  public static final String CONFIG_KITE_ROLL_INTERVAL = "kite.rollInterval";
+  public static int DEFAULT_ROLL_INTERVAL = 30; // seconds
+
+  /**
+   * Flag for committing the Flume transaction on each batch for Flushable
+   * datasets. When set to false, Flume will only commit the transaction when
+   * the roll interval has expired. Setting this to false requires enough space
+   * in the channel to handle all events delivered during the roll interval.
+   * Defaults to true.
+   */
+  public static final String CONFIG_FLUSHABLE_COMMIT_ON_BATCH =
+      "kite.flushable.commitOnBatch";
+  public static boolean DEFAULT_FLUSHABLE_COMMIT_ON_BATCH = true;
+
+  /**
+   * Flag for syncing the DatasetWriter on each batch for Syncable
+   * datasets. Defaults to true.
+   */
+  public static final String CONFIG_SYNCABLE_SYNC_ON_BATCH =
+      "kite.syncable.syncOnBatch";
+  public static boolean DEFAULT_SYNCABLE_SYNC_ON_BATCH = true;
+
+  /**
+   * Parser used to parse Flume Events into Kite entities.
+   */
+  public static final String CONFIG_ENTITY_PARSER = "kite.entityParser";
+
+  /**
+   * Built-in entity parsers
+   */
+  public static final String AVRO_ENTITY_PARSER = "avro";
+  public static final String DEFAULT_ENTITY_PARSER = AVRO_ENTITY_PARSER;
+  public static final String[] AVAILABLE_PARSERS = new String[] {
+    AVRO_ENTITY_PARSER
+  };
+
+  /**
+   * Policy used to handle non-recoverable failures.
+   */
+  public static final String CONFIG_FAILURE_POLICY = "kite.failurePolicy";
+
+  /**
+   * Write non-recoverable Flume events to a Kite dataset.
+   */
+  public static final String SAVE_FAILURE_POLICY = "save";
+
+  /**
+   * The URI to write non-recoverable Flume events to in the case of an error.
+   * If the dataset doesn't exist, it will be created.
+   */
+  public static final String CONFIG_KITE_ERROR_DATASET_URI =
+      "kite.error.dataset.uri";
+
+  /**
+   * Retry non-recoverable Flume events. This will lead to a never-ending cycle
+   * of failure, but matches the previous default semantics of the DatasetSink.
+   */
+  public static final String RETRY_FAILURE_POLICY = "retry";
+  public static final String DEFAULT_FAILURE_POLICY = RETRY_FAILURE_POLICY;
+  public static final String[] AVAILABLE_POLICIES = new String[] {
+    RETRY_FAILURE_POLICY,
+    SAVE_FAILURE_POLICY
+  };
+
+  /**
+   * Headers where Avro schema information is expected.
+   */
+  public static final String AVRO_SCHEMA_LITERAL_HEADER =
+      "flume.avro.schema.literal";
+  public static final String AVRO_SCHEMA_URL_HEADER = "flume.avro.schema.url";
+
+  /**
+   * Hadoop authentication settings
+   */
+  public static final String AUTH_PROXY_USER = "auth.proxyUser";
+  public static final String AUTH_PRINCIPAL = "auth.kerberosPrincipal";
+  public static final String AUTH_KEYTAB = "auth.kerberosKeytab";
+}
diff --git a/code/flume-ng-sinks/flume-dataset-sink/src/main/java/org/apache/flume/sink/kite/NonRecoverableEventException.java b/code/flume-ng-sinks/flume-dataset-sink/src/main/java/org/apache/flume/sink/kite/NonRecoverableEventException.java
new file mode 100644
index 0000000..4373429
--- /dev/null
+++ b/code/flume-ng-sinks/flume-dataset-sink/src/main/java/org/apache/flume/sink/kite/NonRecoverableEventException.java
@@ -0,0 +1,53 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flume.sink.kite;
+
+
+/**
+ * A non-recoverable error trying to deliver the event.
+ * 
+ * Non-recoverable event delivery failures include:
+ * 
+ * 1. An error parsing the event body, thrown from the
+ *    {@link org.apache.flume.sink.kite.parser.EntityParser}.
+ * 2. A mismatch between the event schema and the destination dataset schema.
+ * 3. A missing schema in the Event header when using the
+ *    {@link org.apache.flume.sink.kite.parser.AvroParser}.
+ */
+public class NonRecoverableEventException extends Exception {
+
+  private static final long serialVersionUID = 3485151222482254285L;
+
+  public NonRecoverableEventException() {
+    super();
+  }
+
+  public NonRecoverableEventException(String message) {
+    super(message);
+  }
+
+  public NonRecoverableEventException(String message, Throwable t) {
+    super(message, t);
+  }
+
+  public NonRecoverableEventException(Throwable t) {
+    super(t);
+  }
+
+}
diff --git a/code/flume-ng-sinks/flume-dataset-sink/src/main/java/org/apache/flume/sink/kite/parser/AvroParser.java b/code/flume-ng-sinks/flume-dataset-sink/src/main/java/org/apache/flume/sink/kite/parser/AvroParser.java
new file mode 100644
index 0000000..7c6a723
--- /dev/null
+++ b/code/flume-ng-sinks/flume-dataset-sink/src/main/java/org/apache/flume/sink/kite/parser/AvroParser.java
@@ -0,0 +1,208 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flume.sink.kite.parser;
+
+import com.google.common.base.Preconditions;
+import com.google.common.cache.CacheBuilder;
+import com.google.common.cache.CacheLoader;
+import com.google.common.cache.LoadingCache;
+import com.google.common.util.concurrent.UncheckedExecutionException;
+import java.io.IOException;
+import java.io.InputStream;
+import java.net.URI;
+import java.net.URL;
+import java.util.Locale;
+import java.util.Map;
+import java.util.concurrent.ExecutionException;
+import org.apache.avro.Schema;
+import org.apache.avro.generic.GenericDatumReader;
+import org.apache.avro.generic.GenericRecord;
+import org.apache.avro.io.BinaryDecoder;
+import org.apache.avro.io.DatumReader;
+import org.apache.avro.io.DecoderFactory;
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.flume.EventDeliveryException;
+import org.apache.flume.sink.kite.NonRecoverableEventException;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+
+import static org.apache.flume.sink.kite.DatasetSinkConstants.*;
+
+/**
+ * An {@link EntityParser} that parses Avro serialized bytes from an event.
+ * 
+ * The Avro schema used to serialize the data should be set as either a URL
+ * or literal in the flume.avro.schema.url or flume.avro.schema.literal event
+ * headers respectively.
+ */
+public class AvroParser implements EntityParser<GenericRecord> {
+
+  static Configuration conf = new Configuration();
+
+  /**
+   * A cache of literal schemas to avoid re-parsing the schema.
+   */
+  private static final LoadingCache<String, Schema> schemasFromLiteral =
+      CacheBuilder.newBuilder()
+      .build(new CacheLoader<String, Schema>() {
+        @Override
+        public Schema load(String literal) {
+          Preconditions.checkNotNull(literal,
+              "Schema literal cannot be null without a Schema URL");
+          return new Schema.Parser().parse(literal);
+        }
+      });
+
+  /**
+   * A cache of schemas retrieved by URL to avoid re-parsing the schema.
+   */
+  private static final LoadingCache<String, Schema> schemasFromURL =
+      CacheBuilder.newBuilder()
+      .build(new CacheLoader<String, Schema>() {
+        @Override
+        public Schema load(String url) throws IOException {
+          Schema.Parser parser = new Schema.Parser();
+          InputStream is = null;
+          try {
+            if (url.toLowerCase(Locale.ENGLISH).startsWith("hdfs:/")) {
+              FileSystem fs = FileSystem.get(URI.create(url), conf);
+              is = fs.open(new Path(url));
+            } else {
+              is = new URL(url).openStream();
+            }
+            return parser.parse(is);
+          } finally {
+            if (is != null) {
+              is.close();
+            }
+          }
+        }
+      });
+
+  /**
+   * The schema of the destination dataset.
+   * 
+   * Used as the reader schema during parsing.
+   */
+  private final Schema datasetSchema;
+
+  /**
+   * A cache of DatumReaders per schema.
+   */
+  private final LoadingCache<Schema, DatumReader<GenericRecord>> readers =
+      CacheBuilder.newBuilder()
+          .build(new CacheLoader<Schema, DatumReader<GenericRecord>>() {
+            @Override
+            public DatumReader<GenericRecord> load(Schema schema) {
+              // must use the target dataset's schema for reading to ensure the
+              // records are able to be stored using it
+              return new GenericDatumReader<GenericRecord>(
+                  schema, datasetSchema);
+            }
+          });
+
+  /**
+   * The binary decoder to reuse for event parsing.
+   */
+  private BinaryDecoder decoder = null;
+
+  /**
+   * Create a new AvroParser given the schema of the destination dataset.
+   * 
+   * @param datasetSchema The schema of the destination dataset.
+   */
+  private AvroParser(Schema datasetSchema) {
+    this.datasetSchema = datasetSchema;
+  }
+
+  /**
+   * Parse the entity from the body of the given event.
+   * 
+   * @param event The event to parse.
+   * @param reuse If non-null, this may be reused and returned from this method.
+   * @return The parsed entity as a GenericRecord.
+   * @throws EventDeliveryException A recoverable error such as an error
+   *                                downloading the schema from the URL has
+   *                                occurred.
+   * @throws NonRecoverableEventException A non-recoverable error such as an
+   *                                      unparsable schema or entity has
+   *                                      occurred.
+   */
+  @Override
+  public GenericRecord parse(Event event, GenericRecord reuse)
+      throws EventDeliveryException, NonRecoverableEventException {
+    decoder = DecoderFactory.get().binaryDecoder(event.getBody(), decoder);
+
+    try {
+      DatumReader<GenericRecord> reader = readers.getUnchecked(schema(event));
+      return reader.read(reuse, decoder);
+    } catch (IOException ex) {
+      throw new NonRecoverableEventException("Cannot deserialize event", ex);
+    } catch (RuntimeException ex) {
+      throw new NonRecoverableEventException("Cannot deserialize event", ex);
+    }
+  }
+
+  /**
+   * Get the schema from the event headers.
+   * 
+   * @param event The Flume event
+   * @return The schema for the event
+   * @throws EventDeliveryException A recoverable error such as an error
+   *                                downloading the schema from the URL has
+   *                                occurred.
+   * @throws NonRecoverableEventException A non-recoverable error such as an
+   *                                      unparsable schema has occurred.
+   */
+  private static Schema schema(Event event) throws EventDeliveryException,
+      NonRecoverableEventException {
+    Map<String, String> headers = event.getHeaders();
+    String schemaURL = headers.get(AVRO_SCHEMA_URL_HEADER);
+    try {
+      if (schemaURL != null) {
+        return schemasFromURL.get(schemaURL);
+      } else {
+        String schemaLiteral = headers.get(AVRO_SCHEMA_LITERAL_HEADER);
+        if (schemaLiteral == null) {
+          throw new NonRecoverableEventException("No schema in event headers."
+              + " Headers must include either " + AVRO_SCHEMA_URL_HEADER
+              + " or " + AVRO_SCHEMA_LITERAL_HEADER);
+        }
+
+        return schemasFromLiteral.get(schemaLiteral);
+      }
+    } catch (ExecutionException ex) {
+      throw new EventDeliveryException("Cannot get schema", ex.getCause());
+    } catch (UncheckedExecutionException ex) {
+      throw new NonRecoverableEventException("Cannot parse schema",
+          ex.getCause());
+    }
+  }
+
+  public static class Builder implements EntityParser.Builder<GenericRecord> {
+
+    @Override
+    public EntityParser<GenericRecord> build(Schema datasetSchema, Context config) {
+      return new AvroParser(datasetSchema);
+    }
+    
+  }
+}
diff --git a/code/flume-ng-sinks/flume-dataset-sink/src/main/java/org/apache/flume/sink/kite/parser/EntityParser.java b/code/flume-ng-sinks/flume-dataset-sink/src/main/java/org/apache/flume/sink/kite/parser/EntityParser.java
new file mode 100644
index 0000000..f2051a2
--- /dev/null
+++ b/code/flume-ng-sinks/flume-dataset-sink/src/main/java/org/apache/flume/sink/kite/parser/EntityParser.java
@@ -0,0 +1,56 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flume.sink.kite.parser;
+
+import javax.annotation.concurrent.NotThreadSafe;
+import org.apache.avro.Schema;
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.flume.EventDeliveryException;
+import org.apache.flume.sink.kite.NonRecoverableEventException;
+
+@NotThreadSafe
+public interface EntityParser<E> {
+
+  /**
+   * Parse a Kite entity from a Flume event
+   *
+   * @param event The event to parse
+   * @param reuse If non-null, this may be reused and returned
+   * @return The parsed entity
+   * @throws EventDeliveryException A recoverable error during parsing. Parsing
+   *                                can be safely retried.
+   * @throws NonRecoverableEventException A non-recoverable error during
+   *                                      parsing. The event must be discarded.
+   *                                    
+   */
+  public E parse(Event event, E reuse) throws EventDeliveryException,
+      NonRecoverableEventException;
+
+  /**
+   * Knows how to build {@code EntityParser}s. Implementers must provide a
+   * no-arg constructor.
+   * 
+   * @param <E> The type of entities generated
+   */
+  public static interface Builder<E> {
+
+    public EntityParser<E> build(Schema datasetSchema, Context config);
+  }
+}
diff --git a/code/flume-ng-sinks/flume-dataset-sink/src/main/java/org/apache/flume/sink/kite/parser/EntityParserFactory.java b/code/flume-ng-sinks/flume-dataset-sink/src/main/java/org/apache/flume/sink/kite/parser/EntityParserFactory.java
new file mode 100644
index 0000000..3720ff3
--- /dev/null
+++ b/code/flume-ng-sinks/flume-dataset-sink/src/main/java/org/apache/flume/sink/kite/parser/EntityParserFactory.java
@@ -0,0 +1,81 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flume.sink.kite.parser;
+
+import java.util.Arrays;
+import org.apache.avro.Schema;
+import org.apache.avro.generic.GenericRecord;
+import org.apache.flume.Context;
+
+import static org.apache.flume.sink.kite.DatasetSinkConstants.*;
+
+public class EntityParserFactory {
+
+  public EntityParser<GenericRecord> newParser(Schema datasetSchema, Context config) {
+    EntityParser<GenericRecord> parser;
+
+    String parserType = config.getString(CONFIG_ENTITY_PARSER,
+        DEFAULT_ENTITY_PARSER);
+
+    if (parserType.equals(AVRO_ENTITY_PARSER)) {
+      parser = new AvroParser.Builder().build(datasetSchema, config);
+    } else {
+
+      Class<? extends EntityParser.Builder> builderClass;
+      Class c;
+      try {
+        c = Class.forName(parserType);
+      } catch (ClassNotFoundException ex) {
+        throw new IllegalArgumentException("EntityParser.Builder class "
+            + parserType + " not found. Must set " + CONFIG_ENTITY_PARSER
+            + " to a class that implements EntityParser.Builder or to a builtin"
+            + " parser: " + Arrays.toString(AVAILABLE_PARSERS), ex);
+      }
+
+      if (c != null && EntityParser.Builder.class.isAssignableFrom(c)) {
+        builderClass = c;
+      } else {
+        throw new IllegalArgumentException("Class " + parserType + " does not"
+            + " implement EntityParser.Builder. Must set "
+            + CONFIG_ENTITY_PARSER + " to a class that implements"
+            + " EntityParser.Builder or to a builtin parser: "
+            + Arrays.toString(AVAILABLE_PARSERS));
+      }
+
+      EntityParser.Builder<GenericRecord> builder;
+      try {
+        builder = builderClass.newInstance();
+      } catch (InstantiationException ex) {
+        throw new IllegalArgumentException("Can't instantiate class "
+            + parserType + ". Must set " + CONFIG_ENTITY_PARSER + " to a class"
+            + " that implements EntityParser.Builder or to a builtin parser: "
+            + Arrays.toString(AVAILABLE_PARSERS), ex);
+      } catch (IllegalAccessException ex) {
+        throw new IllegalArgumentException("Can't instantiate class "
+            + parserType + ". Must set " + CONFIG_ENTITY_PARSER + " to a class"
+            + " that implements EntityParser.Builder or to a builtin parser: "
+            + Arrays.toString(AVAILABLE_PARSERS), ex);
+      }
+
+      parser = builder.build(datasetSchema, config);
+    }
+
+    return parser;
+  }
+}
diff --git a/code/flume-ng-sinks/flume-dataset-sink/src/main/java/org/apache/flume/sink/kite/policy/FailurePolicy.java b/code/flume-ng-sinks/flume-dataset-sink/src/main/java/org/apache/flume/sink/kite/policy/FailurePolicy.java
new file mode 100644
index 0000000..f6f875a
--- /dev/null
+++ b/code/flume-ng-sinks/flume-dataset-sink/src/main/java/org/apache/flume/sink/kite/policy/FailurePolicy.java
@@ -0,0 +1,105 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flume.sink.kite.policy;
+
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.flume.EventDeliveryException;
+import org.apache.flume.sink.kite.DatasetSink;
+import org.kitesdk.data.Syncable;
+
+/**
+ * A policy for dealing with non-recoverable event delivery failures.
+ *
+ * Non-recoverable event delivery failures include:
+ *
+ * 1. An error thrown by the {@link EntityParser} while parsing the event body
+ * 2. A schema mismatch between the schema of an event and the schema of the
+ *    destination dataset.
+ * 3. A missing schema from the Event header when using the
+ *    {@link AvroEntityParser}.
+ *
+ * The life cycle of a FailurePolicy mimics the life cycle of the
+ * {@link DatasetSink#writer}:
+ *
+ * 1. When a new writer is created, the policy will be instantiated.
+ * 2. As Event failures happen,
+ *    {@link #handle(org.apache.flume.Event, java.lang.Throwable)} will be
+ *    called to let the policy handle the failure.
+ * 3. If the {@link DatasetSink} is configured to commit on batch, then the
+ *    {@link #sync()} method will be called when the batch is committed.
+ * 4. When the writer is closed, the policy's {@link #close()} method will be
+ *    called.
+ */
+public interface FailurePolicy {
+
+  /**
+   * Handle a non-recoverable event delivery failure.
+   *
+   * @param event The event
+   * @param cause The cause of the failure
+   * @throws EventDeliveryException The policy failed to handle the event. When
+   *                                this is thrown, the Flume transaction will
+   *                                be rolled back and the event will be retried
+   *                                along with the rest of the batch.
+   */
+  public void handle(Event event, Throwable cause)
+      throws EventDeliveryException;
+
+  /**
+   * Ensure any handled events are on stable storage.
+   *
+   * This allows the policy implementation to sync any data that it may not
+   * have fully handled.
+   *
+   * See {@link Syncable#sync()}.
+   *
+   * @throws EventDeliveryException The policy failed while syncing data.
+   *                                When this is thrown, the Flume transaction
+   *                                will be rolled back and the batch will be
+   *                                retried.
+   */
+  public void sync() throws EventDeliveryException;
+
+  /**
+   * Close this FailurePolicy and release any resources.
+   *
+   * @throws EventDeliveryException The policy failed while closing resources.
+   *                                When this is thrown, the Flume transaction
+   *                                will be rolled back and the batch will be
+   *                                retried.
+   */
+  public void close() throws EventDeliveryException;
+
+  /**
+   * Knows how to build {@code FailurePolicy}s. Implementers must provide a
+   * no-arg constructor.
+   */
+  public static interface Builder {
+
+    /**
+     * Build a new {@code FailurePolicy}
+     *
+     * @param config The Flume configuration context
+     * @return The {@code FailurePolicy}
+     */
+    FailurePolicy build(Context config);
+  }
+
+}
diff --git a/code/flume-ng-sinks/flume-dataset-sink/src/main/java/org/apache/flume/sink/kite/policy/FailurePolicyFactory.java b/code/flume-ng-sinks/flume-dataset-sink/src/main/java/org/apache/flume/sink/kite/policy/FailurePolicyFactory.java
new file mode 100644
index 0000000..d3b1fe8
--- /dev/null
+++ b/code/flume-ng-sinks/flume-dataset-sink/src/main/java/org/apache/flume/sink/kite/policy/FailurePolicyFactory.java
@@ -0,0 +1,81 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flume.sink.kite.policy;
+
+import java.util.Arrays;
+import org.apache.flume.Context;
+
+import static org.apache.flume.sink.kite.DatasetSinkConstants.*;
+
+public class FailurePolicyFactory {
+
+  public FailurePolicy newPolicy(Context config) {
+    FailurePolicy policy;
+
+    String policyType = config.getString(CONFIG_FAILURE_POLICY,
+        DEFAULT_FAILURE_POLICY);
+
+    if (policyType.equals(RETRY_FAILURE_POLICY)) {
+      policy = new RetryPolicy.Builder().build(config);
+    } else if (policyType.equals(SAVE_FAILURE_POLICY)) {
+      policy = new SavePolicy.Builder().build(config);
+    } else {
+
+      Class<? extends FailurePolicy.Builder> builderClass;
+      Class c;
+      try {
+        c = Class.forName(policyType);
+      } catch (ClassNotFoundException ex) {
+        throw new IllegalArgumentException("FailurePolicy.Builder class "
+            + policyType + " not found. Must set " + CONFIG_FAILURE_POLICY
+            + " to a class that implements FailurePolicy.Builder or to a builtin"
+            + " policy: " + Arrays.toString(AVAILABLE_POLICIES), ex);
+      }
+
+      if (c != null && FailurePolicy.Builder.class.isAssignableFrom(c)) {
+        builderClass = c;
+      } else {
+        throw new IllegalArgumentException("Class " + policyType + " does not"
+            + " implement FailurePolicy.Builder. Must set "
+            + CONFIG_FAILURE_POLICY + " to a class that implements"
+            + " FailurePolicy.Builder or to a builtin policy: "
+            + Arrays.toString(AVAILABLE_POLICIES));
+      }
+
+      FailurePolicy.Builder builder;
+      try {
+        builder = builderClass.newInstance();
+      } catch (InstantiationException ex) {
+        throw new IllegalArgumentException("Can't instantiate class "
+            + policyType + ". Must set " + CONFIG_FAILURE_POLICY + " to a"
+            + " class that implements FailurePolicy.Builder or to a builtin"
+            + " policy: " + Arrays.toString(AVAILABLE_POLICIES), ex);
+      } catch (IllegalAccessException ex) {
+        throw new IllegalArgumentException("Can't instantiate class "
+            + policyType + ". Must set " + CONFIG_FAILURE_POLICY + " to a"
+            + " class that implements FailurePolicy.Builder or to a builtin"
+            + " policy: " + Arrays.toString(AVAILABLE_POLICIES), ex);
+      }
+
+      policy = builder.build(config);
+    }
+
+    return policy;
+  }
+}
diff --git a/code/flume-ng-sinks/flume-dataset-sink/src/main/java/org/apache/flume/sink/kite/policy/RetryPolicy.java b/code/flume-ng-sinks/flume-dataset-sink/src/main/java/org/apache/flume/sink/kite/policy/RetryPolicy.java
new file mode 100644
index 0000000..9a4991c
--- /dev/null
+++ b/code/flume-ng-sinks/flume-dataset-sink/src/main/java/org/apache/flume/sink/kite/policy/RetryPolicy.java
@@ -0,0 +1,63 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flume.sink.kite.policy;
+
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.flume.EventDeliveryException;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * A failure policy that logs the error and then forces a retry by throwing
+ * {@link EventDeliveryException}.
+ */
+public class RetryPolicy implements FailurePolicy {
+  private static final Logger LOG = LoggerFactory.getLogger(RetryPolicy.class);
+
+  private RetryPolicy() {
+  }
+
+  @Override
+  public void handle(Event event, Throwable cause) throws EventDeliveryException {
+    LOG.error("Event delivery failed: " + cause.getLocalizedMessage());
+    LOG.debug("Exception follows.", cause);
+
+    throw new EventDeliveryException(cause);
+  }
+
+  @Override
+  public void sync() throws EventDeliveryException {
+    // do nothing
+  }
+
+  @Override
+  public void close() throws EventDeliveryException {
+    // do nothing
+  }
+
+  public static class Builder implements FailurePolicy.Builder {
+
+    @Override
+    public FailurePolicy build(Context config) {
+      return new RetryPolicy();
+    }
+
+  }
+}
diff --git a/code/flume-ng-sinks/flume-dataset-sink/src/main/java/org/apache/flume/sink/kite/policy/SavePolicy.java b/code/flume-ng-sinks/flume-dataset-sink/src/main/java/org/apache/flume/sink/kite/policy/SavePolicy.java
new file mode 100644
index 0000000..bd537ec
--- /dev/null
+++ b/code/flume-ng-sinks/flume-dataset-sink/src/main/java/org/apache/flume/sink/kite/policy/SavePolicy.java
@@ -0,0 +1,128 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flume.sink.kite.policy;
+
+import com.google.common.base.Preconditions;
+import com.google.common.collect.Maps;
+import java.nio.ByteBuffer;
+import java.util.Map;
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.flume.EventDeliveryException;
+import org.apache.flume.source.avro.AvroFlumeEvent;
+import org.kitesdk.data.DatasetDescriptor;
+import org.kitesdk.data.DatasetWriter;
+import org.kitesdk.data.Datasets;
+import org.kitesdk.data.Formats;
+import org.kitesdk.data.Syncable;
+import org.kitesdk.data.View;
+
+import static org.apache.flume.sink.kite.DatasetSinkConstants.*;
+
+/**
+ * A failure policy that writes the raw Flume event to a Kite dataset.
+ */
+public class SavePolicy implements FailurePolicy {
+
+  private final View<AvroFlumeEvent> dataset;
+  private DatasetWriter<AvroFlumeEvent> writer;
+  private int nEventsHandled;
+
+  private SavePolicy(Context context) {
+    String uri = context.getString(CONFIG_KITE_ERROR_DATASET_URI);
+    Preconditions.checkArgument(uri != null, "Must set "
+        + CONFIG_KITE_ERROR_DATASET_URI + " when " + CONFIG_FAILURE_POLICY
+        + "=save");
+    if (Datasets.exists(uri)) {
+      dataset = Datasets.load(uri, AvroFlumeEvent.class);
+    } else {
+      DatasetDescriptor descriptor = new DatasetDescriptor.Builder()
+          .schema(AvroFlumeEvent.class)
+          .build();
+      dataset = Datasets.create(uri, descriptor, AvroFlumeEvent.class);
+    }
+
+    nEventsHandled = 0;
+  }
+
+  @Override
+  public void handle(Event event, Throwable cause) throws EventDeliveryException {
+    try {
+      if (writer == null) {
+        writer = dataset.newWriter();
+      }
+
+      final AvroFlumeEvent avroEvent = new AvroFlumeEvent();
+      avroEvent.setBody(ByteBuffer.wrap(event.getBody()));
+      avroEvent.setHeaders(toCharSeqMap(event.getHeaders()));
+
+      writer.write(avroEvent);
+      nEventsHandled++;
+    } catch (RuntimeException ex) {
+      throw new EventDeliveryException(ex);
+    }
+  }
+
+  @Override
+  public void sync() throws EventDeliveryException {
+    if (nEventsHandled > 0) {
+      if (Formats.PARQUET.equals(
+          dataset.getDataset().getDescriptor().getFormat())) {
+        // We need to close the writer on sync if we're writing to a Parquet
+        // dataset
+        close();
+      } else {
+        if (writer instanceof Syncable) {
+          ((Syncable) writer).sync();
+        }
+      }
+    }
+  }
+
+  @Override
+  public void close() throws EventDeliveryException {
+    if (nEventsHandled > 0) {
+      try {
+        writer.close();
+      } catch (RuntimeException ex) {
+        throw new EventDeliveryException(ex);
+      } finally {
+        writer = null;
+        nEventsHandled = 0;
+      }
+    }
+  }
+
+  /**
+   * Helper function to convert a map of String to a map of CharSequence.
+   */
+  private static Map<CharSequence, CharSequence> toCharSeqMap(
+      Map<String, String> map) {
+    return Maps.<CharSequence, CharSequence>newHashMap(map);
+  }
+
+  public static class Builder implements FailurePolicy.Builder {
+
+    @Override
+    public FailurePolicy build(Context config) {
+      return new SavePolicy(config);
+    }
+
+  }
+}
diff --git a/code/flume-ng-sinks/flume-dataset-sink/src/test/java/org/apache/flume/sink/kite/TestDatasetSink.java b/code/flume-ng-sinks/flume-dataset-sink/src/test/java/org/apache/flume/sink/kite/TestDatasetSink.java
new file mode 100644
index 0000000..3709577
--- /dev/null
+++ b/code/flume-ng-sinks/flume-dataset-sink/src/test/java/org/apache/flume/sink/kite/TestDatasetSink.java
@@ -0,0 +1,1036 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flume.sink.kite;
+
+import com.google.common.base.Function;
+import com.google.common.base.Throwables;
+import com.google.common.collect.Iterables;
+import com.google.common.collect.Lists;
+import com.google.common.collect.Maps;
+import com.google.common.collect.Sets;
+import org.apache.avro.Schema;
+import org.apache.avro.file.DataFileWriter;
+import org.apache.avro.generic.GenericData;
+import org.apache.avro.generic.GenericRecord;
+import org.apache.avro.generic.GenericRecordBuilder;
+import org.apache.avro.io.Encoder;
+import org.apache.avro.io.EncoderFactory;
+import org.apache.avro.reflect.ReflectDatumWriter;
+import org.apache.avro.util.Utf8;
+import org.apache.commons.io.FileUtils;
+import org.apache.flume.Channel;
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.flume.EventDeliveryException;
+import org.apache.flume.Transaction;
+import org.apache.flume.channel.MemoryChannel;
+import org.apache.flume.conf.Configurables;
+import org.apache.flume.event.SimpleEvent;
+import org.apache.flume.sink.kite.parser.EntityParser;
+import org.apache.flume.sink.kite.policy.FailurePolicy;
+import org.apache.flume.source.avro.AvroFlumeEvent;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.hdfs.MiniDFSCluster;
+import org.junit.After;
+import org.junit.AfterClass;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.BeforeClass;
+import org.junit.Test;
+import org.kitesdk.data.Dataset;
+import org.kitesdk.data.DatasetDescriptor;
+import org.kitesdk.data.DatasetReader;
+import org.kitesdk.data.DatasetWriter;
+import org.kitesdk.data.Datasets;
+import org.kitesdk.data.PartitionStrategy;
+import org.kitesdk.data.View;
+
+import javax.annotation.Nullable;
+import java.io.ByteArrayOutputStream;
+import java.io.File;
+import java.io.FileWriter;
+import java.io.IOException;
+import java.net.URI;
+import java.nio.ByteBuffer;
+import java.util.Arrays;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.concurrent.Callable;
+
+import static org.mockito.Mockito.any;
+import static org.mockito.Mockito.doThrow;
+import static org.mockito.Mockito.eq;
+import static org.mockito.Mockito.mock;
+import static org.mockito.Mockito.never;
+import static org.mockito.Mockito.verify;
+import static org.mockito.Mockito.when;
+
+public class TestDatasetSink {
+
+  public static final String FILE_REPO_URI = "repo:file:target/test_repo";
+  public static final String DATASET_NAME = "test";
+  public static final String FILE_DATASET_URI =
+      "dataset:file:target/test_repo/" + DATASET_NAME;
+  public static final String ERROR_DATASET_URI =
+      "dataset:file:target/test_repo/failed_events";
+  public static final File SCHEMA_FILE = new File("target/record-schema.avsc");
+  public static final Schema RECORD_SCHEMA = new Schema.Parser().parse(
+      "{\"type\":\"record\",\"name\":\"rec\",\"fields\":[" +
+          "{\"name\":\"id\",\"type\":\"string\"}," +
+          "{\"name\":\"msg\",\"type\":[\"string\",\"null\"]," +
+              "\"default\":\"default\"}]}");
+  public static final Schema COMPATIBLE_SCHEMA = new Schema.Parser().parse(
+      "{\"type\":\"record\",\"name\":\"rec\",\"fields\":[" +
+          "{\"name\":\"id\",\"type\":\"string\"}]}");
+  public static final Schema INCOMPATIBLE_SCHEMA = new Schema.Parser().parse(
+      "{\"type\":\"record\",\"name\":\"user\",\"fields\":[" +
+          "{\"name\":\"username\",\"type\":\"string\"}]}");
+  public static final Schema UPDATED_SCHEMA = new Schema.Parser().parse(
+      "{\"type\":\"record\",\"name\":\"rec\",\"fields\":[" +
+          "{\"name\":\"id\",\"type\":\"string\"}," +
+          "{\"name\":\"priority\",\"type\":\"int\", \"default\": 0}," +
+          "{\"name\":\"msg\",\"type\":[\"string\",\"null\"]," +
+          "\"default\":\"default\"}]}");
+  public static final DatasetDescriptor DESCRIPTOR = new DatasetDescriptor
+      .Builder()
+      .schema(RECORD_SCHEMA)
+      .build();
+
+  Context config = null;
+  Channel in = null;
+  List<GenericRecord> expected = null;
+  private static final String DFS_DIR = "target/test/dfs";
+  private static final String TEST_BUILD_DATA_KEY = "test.build.data";
+  private static String oldTestBuildDataProp = null;
+
+  @BeforeClass
+  public static void saveSchema() throws IOException {
+    oldTestBuildDataProp = System.getProperty(TEST_BUILD_DATA_KEY);
+    System.setProperty(TEST_BUILD_DATA_KEY, DFS_DIR);
+    FileWriter schema = new FileWriter(SCHEMA_FILE);
+    schema.append(RECORD_SCHEMA.toString());
+    schema.close();
+  }
+
+  @AfterClass
+  public static void tearDownClass() {
+    FileUtils.deleteQuietly(new File(DFS_DIR));
+    if (oldTestBuildDataProp != null) {
+      System.setProperty(TEST_BUILD_DATA_KEY, oldTestBuildDataProp);
+    }
+  }
+
+  @Before
+  public void setup() throws EventDeliveryException {
+    Datasets.delete(FILE_DATASET_URI);
+    Datasets.create(FILE_DATASET_URI, DESCRIPTOR);
+
+    this.config = new Context();
+    config.put("keep-alive", "0");
+    this.in = new MemoryChannel();
+    Configurables.configure(in, config);
+
+    config.put(DatasetSinkConstants.CONFIG_KITE_DATASET_URI, FILE_DATASET_URI);
+
+    GenericRecordBuilder builder = new GenericRecordBuilder(RECORD_SCHEMA);
+    expected = Lists.<GenericRecord>newArrayList(
+        builder.set("id", "1").set("msg", "msg1").build(),
+        builder.set("id", "2").set("msg", "msg2").build(),
+        builder.set("id", "3").set("msg", "msg3").build());
+
+    putToChannel(in, Iterables.transform(expected,
+        new Function<GenericRecord, Event>() {
+          private int i = 0;
+
+          @Override
+          public Event apply(@Nullable GenericRecord rec) {
+            this.i += 1;
+            boolean useURI = (i % 2) == 0;
+            return event(rec, RECORD_SCHEMA, SCHEMA_FILE, useURI);
+          }
+        }));
+  }
+
+  @After
+  public void teardown() {
+    Datasets.delete(FILE_DATASET_URI);
+  }
+
+  @Test
+  public void testOldConfig() throws EventDeliveryException {
+    config.put(DatasetSinkConstants.CONFIG_KITE_DATASET_URI, null);
+    config.put(DatasetSinkConstants.CONFIG_KITE_REPO_URI, FILE_REPO_URI);
+    config.put(DatasetSinkConstants.CONFIG_KITE_DATASET_NAME, DATASET_NAME);
+
+    DatasetSink sink = sink(in, config);
+
+    // run the sink
+    sink.start();
+    sink.process();
+    sink.stop();
+
+    Assert.assertEquals(
+        Sets.newHashSet(expected),
+        read(Datasets.load(FILE_DATASET_URI)));
+    Assert.assertEquals("Should have committed", 0, remaining(in));
+  }
+
+  @Test
+  public void testDatasetUriOverridesOldConfig() throws EventDeliveryException {
+    // CONFIG_KITE_DATASET_URI is still set, otherwise this will cause an error
+    config.put(DatasetSinkConstants.CONFIG_KITE_REPO_URI, "bad uri");
+    config.put(DatasetSinkConstants.CONFIG_KITE_DATASET_NAME, "");
+
+    DatasetSink sink = sink(in, config);
+
+    // run the sink
+    sink.start();
+    sink.process();
+    sink.stop();
+
+    Assert.assertEquals(
+        Sets.newHashSet(expected),
+        read(Datasets.load(FILE_DATASET_URI)));
+    Assert.assertEquals("Should have committed", 0, remaining(in));
+  }
+
+  @Test
+  public void testFileStore()
+      throws EventDeliveryException, NonRecoverableEventException {
+    DatasetSink sink = sink(in, config);
+
+    // run the sink
+    sink.start();
+    sink.process();
+    sink.stop();
+
+    Assert.assertEquals(
+        Sets.newHashSet(expected),
+        read(Datasets.load(FILE_DATASET_URI)));
+    Assert.assertEquals("Should have committed", 0, remaining(in));
+  }
+
+  @Test
+  public void testParquetDataset() throws EventDeliveryException {
+    Datasets.delete(FILE_DATASET_URI);
+    Dataset<GenericRecord> created = Datasets.create(FILE_DATASET_URI,
+        new DatasetDescriptor.Builder(DESCRIPTOR)
+            .format("parquet")
+            .build());
+
+    DatasetSink sink = sink(in, config);
+
+    // run the sink
+    sink.start();
+    sink.process();
+
+    // the transaction should not commit during the call to process
+    assertThrows("Transaction should still be open", IllegalStateException.class,
+        new Callable() {
+          @Override
+          public Object call() throws EventDeliveryException {
+            in.getTransaction().begin();
+            return null;
+          }
+        });
+    // The records won't commit until the call to stop()
+    Assert.assertEquals("Should not have committed", 0, read(created).size());
+
+    sink.stop();
+
+    Assert.assertEquals(Sets.newHashSet(expected), read(created));
+    Assert.assertEquals("Should have committed", 0, remaining(in));
+  }
+
+  @Test
+  public void testPartitionedData() throws EventDeliveryException {
+    URI partitionedUri = URI.create("dataset:file:target/test_repo/partitioned");
+    try {
+      Datasets.create(partitionedUri, new DatasetDescriptor.Builder(DESCRIPTOR)
+          .partitionStrategy(new PartitionStrategy.Builder()
+              .identity("id", 10) // partition by id
+              .build())
+          .build());
+
+      config.put(DatasetSinkConstants.CONFIG_KITE_DATASET_URI,
+          partitionedUri.toString());
+      DatasetSink sink = sink(in, config);
+
+      // run the sink
+      sink.start();
+      sink.process();
+      sink.stop();
+
+      Assert.assertEquals(
+          Sets.newHashSet(expected),
+          read(Datasets.load(partitionedUri)));
+      Assert.assertEquals("Should have committed", 0, remaining(in));
+    } finally {
+      if (Datasets.exists(partitionedUri)) {
+        Datasets.delete(partitionedUri);
+      }
+    }
+  }
+
+  @Test
+  public void testStartBeforeDatasetCreated() throws EventDeliveryException {
+    // delete the dataset created by setup
+    Datasets.delete(FILE_DATASET_URI);
+
+    DatasetSink sink = sink(in, config);
+
+    // start the sink
+    sink.start();
+
+    // run the sink without a target dataset
+    try {
+      sink.process();
+      Assert.fail("Should have thrown an exception: no such dataset");
+    } catch (EventDeliveryException e) {
+      // expected
+    }
+
+    // create the target dataset
+    Datasets.create(FILE_DATASET_URI, DESCRIPTOR);
+
+    // run the sink
+    sink.process();
+    sink.stop();
+
+    Assert.assertEquals(Sets.newHashSet(expected), read(Datasets.load(FILE_DATASET_URI)));
+    Assert.assertEquals("Should have committed", 0, remaining(in));
+  }
+
+  @Test
+  public void testDatasetUpdate() throws EventDeliveryException {
+    // add an updated record that sets the new priority field
+    GenericRecordBuilder updatedBuilder = new GenericRecordBuilder(UPDATED_SCHEMA);
+    GenericData.Record updatedRecord = updatedBuilder
+        .set("id", "0")
+        .set("priority", 1)
+        .set("msg", "Priority 1 message!")
+        .build();
+
+    // make a set of the expected records with the new schema
+    Set<GenericRecord> expectedAsUpdated = Sets.newHashSet();
+    for (GenericRecord record : expected) {
+      expectedAsUpdated.add(updatedBuilder
+          .clear("priority")
+          .set("id", record.get("id"))
+          .set("msg", record.get("msg"))
+          .build());
+    }
+    expectedAsUpdated.add(updatedRecord);
+
+    DatasetSink sink = sink(in, config);
+
+    // run the sink
+    sink.start();
+    sink.process();
+
+    // update the dataset's schema
+    DatasetDescriptor updated = new DatasetDescriptor
+        .Builder(Datasets.load(FILE_DATASET_URI).getDataset().getDescriptor())
+        .schema(UPDATED_SCHEMA)
+        .build();
+    Datasets.update(FILE_DATASET_URI, updated);
+
+    // trigger a roll on the next process call to refresh the writer
+    sink.roll();
+
+    // add the record to the incoming channel and the expected list
+    putToChannel(in, event(updatedRecord, UPDATED_SCHEMA, null, false));
+
+    // process events with the updated schema
+    sink.process();
+    sink.stop();
+
+    Assert.assertEquals(expectedAsUpdated, read(Datasets.load(FILE_DATASET_URI)));
+    Assert.assertEquals("Should have committed", 0, remaining(in));
+  }
+
+  @Test
+  public void testMiniClusterStore() throws EventDeliveryException, IOException {
+    // setup a minicluster
+    MiniDFSCluster cluster = new MiniDFSCluster
+        .Builder(new Configuration())
+        .build();
+
+    FileSystem dfs = cluster.getFileSystem();
+    Configuration conf = dfs.getConf();
+
+    URI hdfsUri = URI.create(
+        "dataset:" + conf.get("fs.defaultFS") + "/tmp/repo/" + DATASET_NAME);
+    try {
+      // create a repository and dataset in HDFS
+      Datasets.create(hdfsUri, DESCRIPTOR);
+
+      // update the config to use the HDFS repository
+      config.put(DatasetSinkConstants.CONFIG_KITE_DATASET_URI, hdfsUri.toString());
+
+      DatasetSink sink = sink(in, config);
+
+      // run the sink
+      sink.start();
+      sink.process();
+      sink.stop();
+
+      Assert.assertEquals(
+          Sets.newHashSet(expected),
+          read(Datasets.load(hdfsUri)));
+      Assert.assertEquals("Should have committed", 0, remaining(in));
+
+    } finally {
+      if (Datasets.exists(hdfsUri)) {
+        Datasets.delete(hdfsUri);
+      }
+      cluster.shutdown();
+    }
+  }
+
+  @Test
+  public void testBatchSize() throws EventDeliveryException {
+    DatasetSink sink = sink(in, config);
+
+    // release up to two records per process call
+    config.put("kite.batchSize", "2");
+    Configurables.configure(sink, config);
+
+    sink.start();
+    sink.process(); // process the first and second
+    sink.roll(); // roll at the next process call
+    sink.process(); // roll and process the third
+    Assert.assertEquals(
+        Sets.newHashSet(expected.subList(0, 2)),
+        read(Datasets.load(FILE_DATASET_URI)));
+    Assert.assertEquals("Should have committed", 0, remaining(in));
+    sink.roll(); // roll at the next process call
+    sink.process(); // roll, the channel is empty
+    Assert.assertEquals(
+        Sets.newHashSet(expected),
+        read(Datasets.load(FILE_DATASET_URI)));
+    sink.stop();
+  }
+
+  @Test
+  public void testTimedFileRolling()
+      throws EventDeliveryException, InterruptedException {
+    // use a new roll interval
+    config.put("kite.rollInterval", "1"); // in seconds
+
+    DatasetSink sink = sink(in, config);
+
+    Dataset<GenericRecord> records = Datasets.load(FILE_DATASET_URI);
+
+    // run the sink
+    sink.start();
+    sink.process();
+
+    Assert.assertEquals("Should have committed", 0, remaining(in));
+
+    Thread.sleep(1100); // sleep longer than the roll interval
+    sink.process(); // rolling happens in the process method
+
+    Assert.assertEquals(Sets.newHashSet(expected), read(records));
+
+    // wait until the end to stop because it would close the files
+    sink.stop();
+  }
+
+  @Test
+  public void testCompatibleSchemas() throws EventDeliveryException {
+    DatasetSink sink = sink(in, config);
+
+    // add a compatible record that is missing the msg field
+    GenericRecordBuilder compatBuilder = new GenericRecordBuilder(
+        COMPATIBLE_SCHEMA);
+    GenericData.Record compatibleRecord = compatBuilder.set("id", "0").build();
+
+    // add the record to the incoming channel
+    putToChannel(in, event(compatibleRecord, COMPATIBLE_SCHEMA, null, false));
+
+    // the record will be read using the real schema, so create the expected
+    // record using it, but without any data
+
+    GenericRecordBuilder builder = new GenericRecordBuilder(RECORD_SCHEMA);
+    GenericData.Record expectedRecord = builder.set("id", "0").build();
+    expected.add(expectedRecord);
+
+    // run the sink
+    sink.start();
+    sink.process();
+    sink.stop();
+
+    Assert.assertEquals(
+        Sets.newHashSet(expected),
+        read(Datasets.load(FILE_DATASET_URI)));
+    Assert.assertEquals("Should have committed", 0, remaining(in));
+  }
+
+  @Test
+  public void testIncompatibleSchemas() throws EventDeliveryException {
+    final DatasetSink sink = sink(in, config);
+
+    GenericRecordBuilder builder = new GenericRecordBuilder(
+        INCOMPATIBLE_SCHEMA);
+    GenericData.Record rec = builder.set("username", "koala").build();
+    putToChannel(in, event(rec, INCOMPATIBLE_SCHEMA, null, false));
+
+    // run the sink
+    sink.start();
+    assertThrows("Should fail", EventDeliveryException.class,
+        new Callable() {
+          @Override
+          public Object call() throws EventDeliveryException {
+            sink.process();
+            return null;
+          }
+        });
+    sink.stop();
+
+    Assert.assertEquals("Should have rolled back",
+        expected.size() + 1, remaining(in));
+  }
+
+  @Test
+  public void testMissingSchema() throws EventDeliveryException {
+    final DatasetSink sink = sink(in, config);
+
+    Event badEvent = new SimpleEvent();
+    badEvent.setHeaders(Maps.<String, String>newHashMap());
+    badEvent.setBody(serialize(expected.get(0), RECORD_SCHEMA));
+    putToChannel(in, badEvent);
+
+    // run the sink
+    sink.start();
+    assertThrows("Should fail", EventDeliveryException.class,
+        new Callable() {
+          @Override
+          public Object call() throws EventDeliveryException {
+            sink.process();
+            return null;
+          }
+        });
+    sink.stop();
+
+    Assert.assertEquals("Should have rolled back",
+        expected.size() + 1, remaining(in));
+  }
+
+  @Test
+  public void testFileStoreWithSavePolicy() throws EventDeliveryException {
+    if (Datasets.exists(ERROR_DATASET_URI)) {
+      Datasets.delete(ERROR_DATASET_URI);
+    }
+    config.put(DatasetSinkConstants.CONFIG_FAILURE_POLICY,
+        DatasetSinkConstants.SAVE_FAILURE_POLICY);
+    config.put(DatasetSinkConstants.CONFIG_KITE_ERROR_DATASET_URI,
+        ERROR_DATASET_URI);
+    DatasetSink sink = sink(in, config);
+
+    // run the sink
+    sink.start();
+    sink.process();
+    sink.stop();
+
+    Assert.assertEquals(
+        Sets.newHashSet(expected),
+        read(Datasets.load(FILE_DATASET_URI)));
+    Assert.assertEquals("Should have committed", 0, remaining(in));
+  }
+
+  @Test
+  public void testMissingSchemaWithSavePolicy() throws EventDeliveryException {
+    if (Datasets.exists(ERROR_DATASET_URI)) {
+      Datasets.delete(ERROR_DATASET_URI);
+    }
+    config.put(DatasetSinkConstants.CONFIG_FAILURE_POLICY,
+        DatasetSinkConstants.SAVE_FAILURE_POLICY);
+    config.put(DatasetSinkConstants.CONFIG_KITE_ERROR_DATASET_URI,
+        ERROR_DATASET_URI);
+    final DatasetSink sink = sink(in, config);
+
+    Event badEvent = new SimpleEvent();
+    badEvent.setHeaders(Maps.<String, String>newHashMap());
+    badEvent.setBody(serialize(expected.get(0), RECORD_SCHEMA));
+    putToChannel(in, badEvent);
+
+    // run the sink
+    sink.start();
+    sink.process();
+    sink.stop();
+
+    Assert.assertEquals("Good records should have been written",
+        Sets.newHashSet(expected),
+        read(Datasets.load(FILE_DATASET_URI)));
+    Assert.assertEquals("Should not have rolled back", 0, remaining(in));
+    Assert.assertEquals("Should have saved the bad event",
+        Sets.newHashSet(AvroFlumeEvent.newBuilder()
+          .setBody(ByteBuffer.wrap(badEvent.getBody()))
+          .setHeaders(toUtf8Map(badEvent.getHeaders()))
+          .build()),
+        read(Datasets.load(ERROR_DATASET_URI, AvroFlumeEvent.class)));
+  }
+
+  @Test
+  public void testSerializedWithIncompatibleSchemasWithSavePolicy()
+      throws EventDeliveryException {
+    if (Datasets.exists(ERROR_DATASET_URI)) {
+      Datasets.delete(ERROR_DATASET_URI);
+    }
+    config.put(DatasetSinkConstants.CONFIG_FAILURE_POLICY,
+        DatasetSinkConstants.SAVE_FAILURE_POLICY);
+    config.put(DatasetSinkConstants.CONFIG_KITE_ERROR_DATASET_URI,
+        ERROR_DATASET_URI);
+    final DatasetSink sink = sink(in, config);
+
+    GenericRecordBuilder builder = new GenericRecordBuilder(
+        INCOMPATIBLE_SCHEMA);
+    GenericData.Record rec = builder.set("username", "koala").build();
+
+    // We pass in a valid schema in the header, but an incompatible schema
+    // was used to serialize the record
+    Event badEvent = event(rec, INCOMPATIBLE_SCHEMA, SCHEMA_FILE, true);
+    putToChannel(in, badEvent);
+
+    // run the sink
+    sink.start();
+    sink.process();
+    sink.stop();
+
+    Assert.assertEquals("Good records should have been written",
+        Sets.newHashSet(expected),
+        read(Datasets.load(FILE_DATASET_URI)));
+    Assert.assertEquals("Should not have rolled back", 0, remaining(in));
+    Assert.assertEquals("Should have saved the bad event",
+        Sets.newHashSet(AvroFlumeEvent.newBuilder()
+          .setBody(ByteBuffer.wrap(badEvent.getBody()))
+          .setHeaders(toUtf8Map(badEvent.getHeaders()))
+          .build()),
+        read(Datasets.load(ERROR_DATASET_URI, AvroFlumeEvent.class)));
+  }
+
+  @Test
+  public void testSerializedWithIncompatibleSchemas() throws EventDeliveryException {
+    final DatasetSink sink = sink(in, config);
+
+    GenericRecordBuilder builder = new GenericRecordBuilder(
+        INCOMPATIBLE_SCHEMA);
+    GenericData.Record rec = builder.set("username", "koala").build();
+
+    // We pass in a valid schema in the header, but an incompatible schema
+    // was used to serialize the record
+    putToChannel(in, event(rec, INCOMPATIBLE_SCHEMA, SCHEMA_FILE, true));
+
+    // run the sink
+    sink.start();
+    assertThrows("Should fail", EventDeliveryException.class,
+        new Callable() {
+          @Override
+          public Object call() throws EventDeliveryException {
+            sink.process();
+            return null;
+          }
+        });
+    sink.stop();
+
+    Assert.assertEquals("Should have rolled back",
+        expected.size() + 1, remaining(in));
+  }
+
+  @Test
+  public void testCommitOnBatch() throws EventDeliveryException {
+    DatasetSink sink = sink(in, config);
+
+    // run the sink
+    sink.start();
+    sink.process();
+
+    // the transaction should commit during the call to process
+    Assert.assertEquals("Should have committed", 0, remaining(in));
+    // but the data won't be visible yet
+    Assert.assertEquals(0,
+        read(Datasets.load(FILE_DATASET_URI)).size());
+
+    sink.stop();
+
+    Assert.assertEquals(
+        Sets.newHashSet(expected),
+        read(Datasets.load(FILE_DATASET_URI)));
+  }
+
+  @Test
+  public void testCommitOnBatchFalse() throws EventDeliveryException {
+    config.put(DatasetSinkConstants.CONFIG_FLUSHABLE_COMMIT_ON_BATCH,
+        Boolean.toString(false));
+    config.put(DatasetSinkConstants.CONFIG_SYNCABLE_SYNC_ON_BATCH,
+        Boolean.toString(false));
+    DatasetSink sink = sink(in, config);
+
+    // run the sink
+    sink.start();
+    sink.process();
+
+    // the transaction should not commit during the call to process
+    assertThrows("Transaction should still be open", IllegalStateException.class,
+        new Callable() {
+          @Override
+          public Object call() throws EventDeliveryException {
+            in.getTransaction().begin();
+            return null;
+          }
+        });
+
+    // the data won't be visible
+    Assert.assertEquals(0,
+        read(Datasets.load(FILE_DATASET_URI)).size());
+
+    sink.stop();
+
+    Assert.assertEquals(
+        Sets.newHashSet(expected),
+        read(Datasets.load(FILE_DATASET_URI)));
+    // the transaction should commit during the call to stop
+    Assert.assertEquals("Should have committed", 0, remaining(in));
+  }
+
+  @Test
+  public void testCommitOnBatchFalseSyncOnBatchTrue() throws EventDeliveryException {
+    config.put(DatasetSinkConstants.CONFIG_FLUSHABLE_COMMIT_ON_BATCH,
+        Boolean.toString(false));
+    config.put(DatasetSinkConstants.CONFIG_SYNCABLE_SYNC_ON_BATCH,
+        Boolean.toString(true));
+
+    try {
+      sink(in, config);
+      Assert.fail("Should have thrown IllegalArgumentException");
+    } catch (IllegalArgumentException ex) {
+      // expected
+    }
+  }
+
+  @Test
+  public void testCloseAndCreateWriter() throws EventDeliveryException {
+    config.put(DatasetSinkConstants.CONFIG_FLUSHABLE_COMMIT_ON_BATCH,
+        Boolean.toString(false));
+    config.put(DatasetSinkConstants.CONFIG_SYNCABLE_SYNC_ON_BATCH,
+        Boolean.toString(false));
+    DatasetSink sink = sink(in, config);
+
+    // run the sink
+    sink.start();
+    sink.process();
+
+    sink.closeWriter();
+    sink.commitTransaction();
+    sink.createWriter();
+
+    Assert.assertNotNull("Writer should not be null", sink.getWriter());
+    Assert.assertEquals("Should have committed", 0, remaining(in));
+
+    sink.stop();
+
+    Assert.assertEquals(
+        Sets.newHashSet(expected),
+        read(Datasets.load(FILE_DATASET_URI)));
+  }
+
+  @Test
+  public void testCloseWriter() throws EventDeliveryException {
+    config.put(DatasetSinkConstants.CONFIG_FLUSHABLE_COMMIT_ON_BATCH,
+        Boolean.toString(false));
+    config.put(DatasetSinkConstants.CONFIG_SYNCABLE_SYNC_ON_BATCH,
+        Boolean.toString(false));
+    DatasetSink sink = sink(in, config);
+
+    // run the sink
+    sink.start();
+    sink.process();
+
+    sink.closeWriter();
+    sink.commitTransaction();
+
+    Assert.assertNull("Writer should be null", sink.getWriter());
+    Assert.assertEquals("Should have committed", 0, remaining(in));
+
+    sink.stop();
+
+    Assert.assertEquals(
+        Sets.newHashSet(expected),
+        read(Datasets.load(FILE_DATASET_URI)));
+  }
+
+  @Test
+  public void testCreateWriter() throws EventDeliveryException {
+    config.put(DatasetSinkConstants.CONFIG_FLUSHABLE_COMMIT_ON_BATCH,
+        Boolean.toString(false));
+    config.put(DatasetSinkConstants.CONFIG_SYNCABLE_SYNC_ON_BATCH,
+        Boolean.toString(false));
+    DatasetSink sink = sink(in, config);
+
+    // run the sink
+    sink.start();
+    sink.process();
+
+    sink.commitTransaction();
+    sink.createWriter();
+    Assert.assertNotNull("Writer should not be null", sink.getWriter());
+    Assert.assertEquals("Should have committed", 0, remaining(in));
+
+    sink.stop();
+
+    Assert.assertEquals(0, read(Datasets.load(FILE_DATASET_URI)).size());
+  }
+
+  @Test
+  public void testAppendWriteExceptionInvokesPolicy()
+      throws EventDeliveryException, NonRecoverableEventException {
+    DatasetSink sink = sink(in, config);
+
+    // run the sink
+    sink.start();
+    sink.process();
+
+    // Mock an Event
+    Event mockEvent = mock(Event.class);
+    when(mockEvent.getBody()).thenReturn(new byte[] { 0x01 });
+
+    // Mock a GenericRecord
+    GenericRecord mockRecord = mock(GenericRecord.class);
+
+    // Mock an EntityParser
+    EntityParser<GenericRecord> mockParser = mock(EntityParser.class);
+    when(mockParser.parse(eq(mockEvent), any(GenericRecord.class)))
+        .thenReturn(mockRecord);
+    sink.setParser(mockParser);
+
+    // Mock a FailurePolicy
+    FailurePolicy mockFailurePolicy = mock(FailurePolicy.class);
+    sink.setFailurePolicy(mockFailurePolicy);
+
+    // Mock a DatasetWriter
+    DatasetWriter<GenericRecord> mockWriter = mock(DatasetWriter.class);
+    doThrow(new DataFileWriter.AppendWriteException(new IOException()))
+        .when(mockWriter).write(mockRecord);
+
+    sink.setWriter(mockWriter);
+    sink.write(mockEvent);
+
+    // Verify that the event was sent to the failure policy
+    verify(mockFailurePolicy).handle(eq(mockEvent), any(Throwable.class));
+
+    sink.stop();
+  }
+
+  @Test
+  public void testRuntimeExceptionThrowsEventDeliveryException()
+      throws EventDeliveryException, NonRecoverableEventException {
+    DatasetSink sink = sink(in, config);
+
+    // run the sink
+    sink.start();
+    sink.process();
+
+    // Mock an Event
+    Event mockEvent = mock(Event.class);
+    when(mockEvent.getBody()).thenReturn(new byte[] { 0x01 });
+
+    // Mock a GenericRecord
+    GenericRecord mockRecord = mock(GenericRecord.class);
+
+    // Mock an EntityParser
+    EntityParser<GenericRecord> mockParser = mock(EntityParser.class);
+    when(mockParser.parse(eq(mockEvent), any(GenericRecord.class)))
+        .thenReturn(mockRecord);
+    sink.setParser(mockParser);
+
+    // Mock a FailurePolicy
+    FailurePolicy mockFailurePolicy = mock(FailurePolicy.class);
+    sink.setFailurePolicy(mockFailurePolicy);
+
+    // Mock a DatasetWriter
+    DatasetWriter<GenericRecord> mockWriter = mock(DatasetWriter.class);
+    doThrow(new RuntimeException()).when(mockWriter).write(mockRecord);
+
+    sink.setWriter(mockWriter);
+
+    try {
+      sink.write(mockEvent);
+      Assert.fail("Should throw EventDeliveryException");
+    } catch (EventDeliveryException ex) {
+      // expected
+    }
+
+    // Verify that the event was not sent to the failure policy
+    verify(mockFailurePolicy, never()).handle(eq(mockEvent), any(Throwable.class));
+
+    sink.stop();
+  }
+
+  @Test
+  public void testProcessHandlesNullWriter() throws EventDeliveryException,
+      NonRecoverableEventException {
+    DatasetSink sink = sink(in, config);
+
+    // run the sink
+    sink.start();
+    sink.process();
+
+    // explicitly set the writer to null
+    sink.setWriter(null);
+
+    // this should not throw an NPE
+    sink.process();
+
+    sink.stop();
+
+    Assert.assertEquals("Should have committed", 0, remaining(in));
+  }
+
+  public static DatasetSink sink(Channel in, Context config) {
+    DatasetSink sink = new DatasetSink();
+    sink.setChannel(in);
+    Configurables.configure(sink, config);
+    return sink;
+  }
+
+  public static <T> HashSet<T> read(View<T> view) {
+    DatasetReader<T> reader = null;
+    try {
+      reader = view.newReader();
+      return Sets.newHashSet(reader.iterator());
+    } finally {
+      if (reader != null) {
+        reader.close();
+      }
+    }
+  }
+
+  public static int remaining(Channel ch) throws EventDeliveryException {
+    Transaction t = ch.getTransaction();
+    try {
+      t.begin();
+      int count = 0;
+      while (ch.take() != null) {
+        count += 1;
+      }
+      t.commit();
+      return count;
+    } catch (Throwable th) {
+      t.rollback();
+      Throwables.propagateIfInstanceOf(th, Error.class);
+      Throwables.propagateIfInstanceOf(th, EventDeliveryException.class);
+      throw new EventDeliveryException(th);
+    } finally {
+      t.close();
+    }
+  }
+
+  public static void putToChannel(Channel in, Event... records)
+      throws EventDeliveryException {
+    putToChannel(in, Arrays.asList(records));
+  }
+
+  public static void putToChannel(Channel in, Iterable<Event> records)
+      throws EventDeliveryException {
+    Transaction t = in.getTransaction();
+    try {
+      t.begin();
+      for (Event record : records) {
+        in.put(record);
+      }
+      t.commit();
+    } catch (Throwable th) {
+      t.rollback();
+      Throwables.propagateIfInstanceOf(th, Error.class);
+      Throwables.propagateIfInstanceOf(th, EventDeliveryException.class);
+      throw new EventDeliveryException(th);
+    } finally {
+      t.close();
+    }
+  }
+
+  public static Event event(
+      Object datum, Schema schema, File file, boolean useURI) {
+    Map<String, String> headers = Maps.newHashMap();
+    if (useURI) {
+      headers.put(DatasetSinkConstants.AVRO_SCHEMA_URL_HEADER,
+          file.getAbsoluteFile().toURI().toString());
+    } else {
+      headers.put(DatasetSinkConstants.AVRO_SCHEMA_LITERAL_HEADER,
+          schema.toString());
+    }
+    Event e = new SimpleEvent();
+    e.setBody(serialize(datum, schema));
+    e.setHeaders(headers);
+    return e;
+  }
+
+  @SuppressWarnings("unchecked")
+  public static byte[] serialize(Object datum, Schema schema) {
+    ByteArrayOutputStream out = new ByteArrayOutputStream();
+    Encoder encoder = EncoderFactory.get().binaryEncoder(out, null);
+    ReflectDatumWriter writer = new ReflectDatumWriter(schema);
+    try {
+      writer.write(datum, encoder);
+      encoder.flush();
+    } catch (IOException ex) {
+      Throwables.propagate(ex);
+    }
+    return out.toByteArray();
+  }
+
+  /**
+   * A convenience method to avoid a large number of @Test(expected=...) tests.
+   *
+   * This variant uses a Callable, which is allowed to throw checked Exceptions.
+   *
+   * @param message A String message to describe this assertion
+   * @param expected An Exception class that the Callable should throw
+   * @param callable A Callable that is expected to throw the exception
+   */
+  public static void assertThrows(
+      String message, Class<? extends Exception> expected, Callable callable) {
+    try {
+      callable.call();
+      Assert.fail("No exception was thrown (" + message + "), expected: " +
+          expected.getName());
+    } catch (Exception actual) {
+      Assert.assertEquals(message, expected, actual.getClass());
+    }
+  }
+
+  /**
+   * Helper function to convert a map of String to a map of Utf8.
+   *
+   * @param map A Map of String to String
+   * @return The same mappings converting the {@code String}s to {@link Utf8}s
+   */
+  public static Map<CharSequence, CharSequence> toUtf8Map(
+      Map<String, String> map) {
+    Map<CharSequence, CharSequence> utf8Map = Maps.newHashMap();
+    for (Map.Entry<String, String> entry : map.entrySet()) {
+      utf8Map.put(new Utf8(entry.getKey()), new Utf8(entry.getValue()));
+    }
+    return utf8Map;
+  }
+}
diff --git a/code/flume-ng-sinks/flume-dataset-sink/src/test/resources/enable-kerberos.xml b/code/flume-ng-sinks/flume-dataset-sink/src/test/resources/enable-kerberos.xml
new file mode 100644
index 0000000..85b0447
--- /dev/null
+++ b/code/flume-ng-sinks/flume-dataset-sink/src/test/resources/enable-kerberos.xml
@@ -0,0 +1,30 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+Copyright 2014 Apache Software Foundation.
+
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+     http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+
+<configuration>
+
+  <property>
+    <name>hadoop.security.authentication</name>
+    <value>kerberos</value>
+  </property>
+
+  <property>
+    <name>hadoop.security.authorization</name>
+    <value>true</value>
+  </property>
+
+</configuration>
diff --git a/code/flume-ng-sinks/flume-hdfs-sink/pom.xml b/code/flume-ng-sinks/flume-hdfs-sink/pom.xml
new file mode 100644
index 0000000..bcf6556
--- /dev/null
+++ b/code/flume-ng-sinks/flume-hdfs-sink/pom.xml
@@ -0,0 +1,196 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to You under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+     http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+
+  <modelVersion>4.0.0</modelVersion>
+
+  <parent>
+    <artifactId>flume-ng-sinks</artifactId>
+    <groupId>org.apache.flume</groupId>
+    <version>1.7.0</version>
+  </parent>
+
+  <groupId>org.apache.flume.flume-ng-sinks</groupId>
+  <artifactId>flume-hdfs-sink</artifactId>
+  <name>Flume NG HDFS Sink</name>
+
+  <build>
+    <plugins>
+      <plugin>
+        <groupId>org.apache.rat</groupId>
+        <artifactId>apache-rat-plugin</artifactId>
+      </plugin>
+    </plugins>
+  </build>
+
+  <dependencies>
+
+    <dependency>
+      <groupId>org.apache.flume</groupId>
+      <artifactId>flume-ng-sdk</artifactId>
+    </dependency>
+
+    <dependency>
+      <groupId>org.apache.flume</groupId>
+      <artifactId>flume-ng-configuration</artifactId>
+    </dependency>
+
+    <dependency>
+      <groupId>org.apache.flume</groupId>
+      <artifactId>flume-ng-core</artifactId>
+    </dependency>
+
+    <dependency>
+      <groupId>org.slf4j</groupId>
+      <artifactId>slf4j-api</artifactId>
+    </dependency>
+
+    <dependency>
+      <groupId>com.google.guava</groupId>
+      <artifactId>guava</artifactId>
+    </dependency>
+
+    <dependency>
+      <groupId>junit</groupId>
+      <artifactId>junit</artifactId>
+      <scope>test</scope>
+    </dependency>
+
+    <dependency>
+      <groupId>org.slf4j</groupId>
+      <artifactId>slf4j-log4j12</artifactId>
+      <scope>test</scope>
+    </dependency>
+
+    <dependency>
+      <groupId>org.mockito</groupId>
+      <artifactId>mockito-all</artifactId>
+      <scope>test</scope>
+    </dependency>
+
+    <dependency>
+      <groupId>org.apache.hadoop</groupId>
+      <artifactId>${hadoop.common.artifact.id}</artifactId>
+      <optional>true</optional>
+    </dependency>
+
+    <dependency>
+      <groupId>commons-lang</groupId>
+      <artifactId>commons-lang</artifactId>
+    </dependency>
+
+    <dependency>
+      <groupId>commons-io</groupId>
+      <artifactId>commons-io</artifactId>
+    </dependency>
+
+  </dependencies>
+
+  <profiles>
+
+    <profile>
+      <id>hadoop-1.0</id>
+      <activation>
+        <property>
+          <name>flume.hadoop.profile</name>
+          <value>1</value>
+        </property>
+      </activation>
+      <dependencies>
+
+        <dependency>
+          <groupId>org.apache.hadoop</groupId>
+          <artifactId>hadoop-test</artifactId>
+          <scope>test</scope>
+        </dependency>
+
+        <!-- required because the hadoop-core pom is missing these deps
+            and MiniDFSCluster pulls in the webhdfs classes -->
+        <dependency>
+          <groupId>com.sun.jersey</groupId>
+          <artifactId>jersey-core</artifactId>
+          <scope>test</scope>
+        </dependency>
+
+      </dependencies>
+    </profile>
+
+    <profile>
+      <id>hadoop-2</id>
+      <activation>
+        <property>
+          <name>flume.hadoop.profile</name>
+          <value>2</value>
+        </property>
+      </activation>
+      <dependencies>
+
+        <dependency>
+          <groupId>org.apache.hadoop</groupId>
+          <artifactId>hadoop-hdfs</artifactId>
+          <optional>true</optional>
+        </dependency>
+
+        <dependency>
+          <groupId>org.apache.hadoop</groupId>
+          <artifactId>hadoop-auth</artifactId>
+          <optional>true</optional>
+        </dependency>
+
+        <dependency>
+          <groupId>org.apache.hadoop</groupId>
+          <artifactId>hadoop-minicluster</artifactId>
+          <scope>test</scope>
+        </dependency>
+
+      </dependencies>
+    </profile>
+
+    <profile>
+      <id>hbase-1</id>
+      <activation>
+        <property>
+          <name>!flume.hadoop.profile</name>
+        </property>
+      </activation>
+      <dependencies>
+
+        <dependency>
+          <groupId>org.apache.hadoop</groupId>
+          <artifactId>hadoop-hdfs</artifactId>
+          <optional>true</optional>
+        </dependency>
+
+        <dependency>
+          <groupId>org.apache.hadoop</groupId>
+          <artifactId>hadoop-auth</artifactId>
+          <optional>true</optional>
+        </dependency>
+
+        <dependency>
+          <groupId>org.apache.hadoop</groupId>
+          <artifactId>hadoop-minicluster</artifactId>
+          <scope>test</scope>
+        </dependency>
+
+      </dependencies>
+    </profile>
+  </profiles>
+
+</project>
diff --git a/code/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/AbstractHDFSWriter.java b/code/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/AbstractHDFSWriter.java
new file mode 100644
index 0000000..2fe309f
--- /dev/null
+++ b/code/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/AbstractHDFSWriter.java
@@ -0,0 +1,280 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flume.sink.hdfs;
+
+import com.google.common.base.Preconditions;
+import org.apache.flume.Context;
+import org.apache.flume.FlumeException;
+import org.apache.flume.annotations.InterfaceAudience;
+import org.apache.flume.annotations.InterfaceStability;
+import org.apache.hadoop.fs.FSDataOutputStream;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.io.OutputStream;
+import java.lang.reflect.InvocationTargetException;
+import java.lang.reflect.Method;
+
+@InterfaceAudience.Private
+@InterfaceStability.Evolving
+public abstract class AbstractHDFSWriter implements HDFSWriter {
+
+  private static final Logger logger =
+      LoggerFactory.getLogger(AbstractHDFSWriter.class);
+
+  private FSDataOutputStream outputStream;
+  private FileSystem fs;
+  private Path destPath;
+  private Method refGetNumCurrentReplicas = null;
+  private Method refGetDefaultReplication = null;
+  private Method refHflushOrSync = null;
+  private Integer configuredMinReplicas = null;
+  private Integer numberOfCloseRetries = null;
+  private long timeBetweenCloseRetries = Long.MAX_VALUE;
+
+  static final Object[] NO_ARGS = new Object[]{};
+
+  @Override
+  public void configure(Context context) {
+    configuredMinReplicas = context.getInteger("hdfs.minBlockReplicas");
+    if (configuredMinReplicas != null) {
+      Preconditions.checkArgument(configuredMinReplicas >= 0,
+          "hdfs.minBlockReplicas must be greater than or equal to 0");
+    }
+    numberOfCloseRetries = context.getInteger("hdfs.closeTries", 1) - 1;
+
+    if (numberOfCloseRetries > 1) {
+      try {
+        timeBetweenCloseRetries = context.getLong("hdfs.callTimeout", 10000L);
+      } catch (NumberFormatException e) {
+        logger.warn("hdfs.callTimeout cannot be parsed to a long: " +
+                    context.getString("hdfs.callTimeout"));
+      }
+      timeBetweenCloseRetries = Math.max(timeBetweenCloseRetries / numberOfCloseRetries, 1000);
+    }
+
+  }
+
+  /**
+   * Contract for subclasses: Call registerCurrentStream() on open,
+   * unregisterCurrentStream() on close, and the base class takes care of the
+   * rest.
+   * @return true if the open file has fewer replicas than desired
+   */
+  @Override
+  public boolean isUnderReplicated() {
+    try {
+      int numBlocks = getNumCurrentReplicas();
+      if (numBlocks == -1) {
+        return false;
+      }
+      int desiredBlocks;
+      if (configuredMinReplicas != null) {
+        desiredBlocks = configuredMinReplicas;
+      } else {
+        desiredBlocks = getFsDesiredReplication();
+      }
+      return numBlocks < desiredBlocks;
+    } catch (IllegalAccessException e) {
+      logger.error("Unexpected error while checking replication factor", e);
+    } catch (InvocationTargetException e) {
+      logger.error("Unexpected error while checking replication factor", e);
+    } catch (IllegalArgumentException e) {
+      logger.error("Unexpected error while checking replication factor", e);
+    }
+    return false;
+  }
+
+  protected void registerCurrentStream(FSDataOutputStream outputStream,
+                                      FileSystem fs, Path destPath) {
+    Preconditions.checkNotNull(outputStream, "outputStream must not be null");
+    Preconditions.checkNotNull(fs, "fs must not be null");
+    Preconditions.checkNotNull(destPath, "destPath must not be null");
+
+    this.outputStream = outputStream;
+    this.fs = fs;
+    this.destPath = destPath;
+    this.refGetNumCurrentReplicas = reflectGetNumCurrentReplicas(outputStream);
+    this.refGetDefaultReplication = reflectGetDefaultReplication(fs);
+    this.refHflushOrSync = reflectHflushOrSync(outputStream);
+
+  }
+
+  protected void unregisterCurrentStream() {
+    this.outputStream = null;
+    this.fs = null;
+    this.destPath = null;
+    this.refGetNumCurrentReplicas = null;
+    this.refGetDefaultReplication = null;
+  }
+
+  public int getFsDesiredReplication() {
+    short replication = 0;
+    if (fs != null && destPath != null) {
+      if (refGetDefaultReplication != null) {
+        try {
+          replication = (Short) refGetDefaultReplication.invoke(fs, destPath);
+        } catch (IllegalAccessException e) {
+          logger.warn("Unexpected error calling getDefaultReplication(Path)", e);
+        } catch (InvocationTargetException e) {
+          logger.warn("Unexpected error calling getDefaultReplication(Path)", e);
+        }
+      } else {
+        // will not work on Federated HDFS (see HADOOP-8014)
+        replication = fs.getDefaultReplication();
+      }
+    }
+    return replication;
+  }
+
+  /**
+   * This method gets the datanode replication count for the current open file.
+   *
+   * If the pipeline isn't started yet or is empty, you will get the default
+   * replication factor.
+   *
+   * <p/>If this function returns -1, it means you
+   * are not properly running with the HDFS-826 patch.
+   * @throws InvocationTargetException
+   * @throws IllegalAccessException
+   * @throws IllegalArgumentException
+   */
+  public int getNumCurrentReplicas()
+      throws IllegalArgumentException, IllegalAccessException,
+          InvocationTargetException {
+    if (refGetNumCurrentReplicas != null && outputStream != null) {
+      OutputStream dfsOutputStream = outputStream.getWrappedStream();
+      if (dfsOutputStream != null) {
+        Object repl = refGetNumCurrentReplicas.invoke(dfsOutputStream, NO_ARGS);
+        if (repl instanceof Integer) {
+          return ((Integer)repl).intValue();
+        }
+      }
+    }
+    return -1;
+  }
+
+  /**
+   * Find the 'getNumCurrentReplicas' method on the passed <code>os</code> stream.
+   * @return Method or null.
+   */
+  private Method reflectGetNumCurrentReplicas(FSDataOutputStream os) {
+    Method m = null;
+    if (os != null) {
+      Class<? extends OutputStream> wrappedStreamClass = os.getWrappedStream()
+          .getClass();
+      try {
+        m = wrappedStreamClass.getDeclaredMethod("getNumCurrentReplicas",
+            new Class<?>[] {});
+        m.setAccessible(true);
+      } catch (NoSuchMethodException e) {
+        logger.info("FileSystem's output stream doesn't support"
+            + " getNumCurrentReplicas; --HDFS-826 not available; fsOut="
+            + wrappedStreamClass.getName() + "; err=" + e);
+      } catch (SecurityException e) {
+        logger.info("Doesn't have access to getNumCurrentReplicas on "
+            + "FileSystem's output stream --HDFS-826 not available; fsOut="
+            + wrappedStreamClass.getName(), e);
+        m = null; // could happen on setAccessible()
+      }
+    }
+    if (m != null) {
+      logger.debug("Using getNumCurrentReplicas--HDFS-826");
+    }
+    return m;
+  }
+
+  /**
+   * Find the 'getDefaultReplication' method on the passed <code>fs</code>
+   * FileSystem that takes a Path argument.
+   * @return Method or null.
+   */
+  private Method reflectGetDefaultReplication(FileSystem fileSystem) {
+    Method m = null;
+    if (fileSystem != null) {
+      Class<?> fsClass = fileSystem.getClass();
+      try {
+        m = fsClass.getMethod("getDefaultReplication",
+            new Class<?>[] { Path.class });
+      } catch (NoSuchMethodException e) {
+        logger.debug("FileSystem implementation doesn't support"
+            + " getDefaultReplication(Path); -- HADOOP-8014 not available; " +
+            "className = " + fsClass.getName() + "; err = " + e);
+      } catch (SecurityException e) {
+        logger.debug("No access to getDefaultReplication(Path) on "
+            + "FileSystem implementation -- HADOOP-8014 not available; " +
+            "className = " + fsClass.getName() + "; err = " + e);
+      }
+    }
+    if (m != null) {
+      logger.debug("Using FileSystem.getDefaultReplication(Path) from " +
+          "HADOOP-8014");
+    }
+    return m;
+  }
+
+  private Method reflectHflushOrSync(FSDataOutputStream os) {
+    Method m = null;
+    if (os != null) {
+      Class<?> fsDataOutputStreamClass = os.getClass();
+      try {
+        m = fsDataOutputStreamClass.getMethod("hflush");
+      } catch (NoSuchMethodException ex) {
+        logger.debug("HFlush not found. Will use sync() instead");
+        try {
+          m = fsDataOutputStreamClass.getMethod("sync");
+        } catch (Exception ex1) {
+          String msg = "Neither hflush nor sync was found. That seems to be " +
+              "a problem!";
+          logger.error(msg);
+          throw new FlumeException(msg, ex1);
+        }
+      }
+    }
+    return m;
+  }
+
+  /**
+   * If hflush is available in this version of HDFS, then this method calls
+   * hflush, else it calls sync.
+   * @param os - The stream to flush/sync
+   * @throws IOException
+   */
+  protected void hflushOrSync(FSDataOutputStream os) throws IOException {
+    try {
+      // At this point the refHflushOrSync cannot be null,
+      // since register method would have thrown if it was.
+      this.refHflushOrSync.invoke(os);
+    } catch (InvocationTargetException e) {
+      String msg = "Error while trying to hflushOrSync!";
+      logger.error(msg);
+      Throwable cause = e.getCause();
+      if (cause instanceof IOException) {
+        throw (IOException)cause;
+      }
+      throw new FlumeException(msg, e);
+    } catch (Exception e) {
+      String msg = "Error while trying to hflushOrSync!";
+      logger.error(msg);
+      throw new FlumeException(msg, e);
+    }
+  }
+}
diff --git a/code/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/AvroEventSerializer.java b/code/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/AvroEventSerializer.java
new file mode 100644
index 0000000..3231742
--- /dev/null
+++ b/code/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/AvroEventSerializer.java
@@ -0,0 +1,211 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.flume.sink.hdfs;
+
+import org.apache.avro.AvroRuntimeException;
+import org.apache.avro.Schema;
+import org.apache.avro.file.CodecFactory;
+import org.apache.avro.file.DataFileWriter;
+import org.apache.avro.generic.GenericDatumWriter;
+import org.apache.avro.io.DatumWriter;
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.flume.FlumeException;
+import org.apache.flume.conf.Configurable;
+import org.apache.flume.serialization.EventSerializer;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FSDataInputStream;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.io.InputStream;
+import java.io.OutputStream;
+import java.net.URL;
+import java.nio.ByteBuffer;
+import java.util.HashMap;
+import java.util.Locale;
+import java.util.Map;
+
+import static org.apache.flume.serialization.AvroEventSerializerConfigurationConstants.COMPRESSION_CODEC;
+import static org.apache.flume.serialization.AvroEventSerializerConfigurationConstants.DEFAULT_COMPRESSION_CODEC;
+import static org.apache.flume.serialization.AvroEventSerializerConfigurationConstants.DEFAULT_STATIC_SCHEMA_URL;
+import static org.apache.flume.serialization.AvroEventSerializerConfigurationConstants.DEFAULT_SYNC_INTERVAL_BYTES;
+import static org.apache.flume.serialization.AvroEventSerializerConfigurationConstants.STATIC_SCHEMA_URL;
+import static org.apache.flume.serialization.AvroEventSerializerConfigurationConstants.SYNC_INTERVAL_BYTES;
+
+/**
+ * <p>
+ * This class serializes Flume {@linkplain org.apache.flume.Event events} into Avro data files. The
+ * Flume event body is read as an Avro datum, and is then written to the
+ * {@link org.apache.flume.serialization.EventSerializer}'s output stream in Avro data file format.
+ * </p>
+ * <p>
+ * The Avro schema is determined by reading a Flume event header. The schema may be
+ * specified either as a URL which the schema may be read from, by setting {@link
+ * #AVRO_SCHEMA_URL_HEADER}, or as a literal, by setting {@link #AVRO_SCHEMA_LITERAL_HEADER}
+ * (not recommended, since the full schema must be transmitted in every event). If neither
+ * header is set, the statically configured schema URL is used when present. Schemas read
+ * from URLs are cached by instances of this class so that the overhead of retrieval is minimized.
+ * </p>
+ */
+public class AvroEventSerializer implements EventSerializer, Configurable {
+
+  private static final Logger logger =
+      LoggerFactory.getLogger(AvroEventSerializer.class);
+
+  public static final String AVRO_SCHEMA_LITERAL_HEADER = "flume.avro.schema.literal";
+  public static final String AVRO_SCHEMA_URL_HEADER = "flume.avro.schema.url";
+
+  private final OutputStream out;
+  private DatumWriter<Object> writer = null;
+  private DataFileWriter<Object> dataFileWriter = null;
+
+  private int syncIntervalBytes;
+  private String compressionCodec;
+  private Map<String, Schema> schemaCache = new HashMap<String, Schema>();
+  private String staticSchemaURL;
+
+  private AvroEventSerializer(OutputStream out) {
+    this.out = out;
+  }
+
+  @Override
+  public void configure(Context context) {
+    syncIntervalBytes =
+        context.getInteger(SYNC_INTERVAL_BYTES, DEFAULT_SYNC_INTERVAL_BYTES);
+    compressionCodec =
+        context.getString(COMPRESSION_CODEC, DEFAULT_COMPRESSION_CODEC);
+    staticSchemaURL = context.getString(STATIC_SCHEMA_URL, DEFAULT_STATIC_SCHEMA_URL);
+  }
+
+  @Override
+  public void afterCreate() throws IOException {
+    // no-op
+  }
+
+  @Override
+  public void afterReopen() throws IOException {
+    // impossible to initialize DataFileWriter without writing the schema?
+    throw new UnsupportedOperationException("Avro API doesn't support append");
+  }
+
+  @Override
+  public void write(Event event) throws IOException {
+    if (dataFileWriter == null) {
+      initialize(event);
+    }
+    dataFileWriter.appendEncoded(ByteBuffer.wrap(event.getBody()));
+  }
+
+  private void initialize(Event event) throws IOException {
+    Schema schema = null;
+    String schemaUrl = event.getHeaders().get(AVRO_SCHEMA_URL_HEADER);
+    String schemaString = event.getHeaders().get(AVRO_SCHEMA_LITERAL_HEADER);
+
+    if (schemaUrl != null) { // if URL_HEADER is there then use it
+      schema = schemaCache.get(schemaUrl);
+      if (schema == null) {
+        schema = loadFromUrl(schemaUrl);
+        schemaCache.put(schemaUrl, schema);
+      }
+    } else if (schemaString != null) { // fallback to LITERAL_HEADER if it was there
+      schema = new Schema.Parser().parse(schemaString);
+    } else if (staticSchemaURL != null) {   // fallback to static url if it was there
+      schema = schemaCache.get(staticSchemaURL);
+      if (schema == null) {
+        schema = loadFromUrl(staticSchemaURL);
+        schemaCache.put(staticSchemaURL, schema);
+      }
+    } else { // no other options so giving up
+      throw new FlumeException("Could not find schema for event " + event);
+    }
+
+    writer = new GenericDatumWriter<Object>(schema);
+    dataFileWriter = new DataFileWriter<Object>(writer);
+
+    dataFileWriter.setSyncInterval(syncIntervalBytes);
+
+    try {
+      CodecFactory codecFactory = CodecFactory.fromString(compressionCodec);
+      dataFileWriter.setCodec(codecFactory);
+    } catch (AvroRuntimeException e) {
+      logger.warn("Unable to instantiate avro codec with name (" +
+          compressionCodec + "). Compression disabled. Exception follows.", e);
+    }
+
+    dataFileWriter.create(schema, out);
+  }
+
+  private Schema loadFromUrl(String schemaUrl) throws IOException {
+    Configuration conf = new Configuration();
+    Schema.Parser parser = new Schema.Parser();
+    if (schemaUrl.toLowerCase(Locale.ENGLISH).startsWith("hdfs://")) {
+      FileSystem fs = FileSystem.get(conf);
+      FSDataInputStream input = null;
+      try {
+        input = fs.open(new Path(schemaUrl));
+        return parser.parse(input);
+      } finally {
+        if (input != null) {
+          input.close();
+        }
+      }
+    } else {
+      InputStream is = null;
+      try {
+        is = new URL(schemaUrl).openStream();
+        return parser.parse(is);
+      } finally {
+        if (is != null) {
+          is.close();
+        }
+      }
+    }
+  }
+
+  @Override
+  public void flush() throws IOException {
+    dataFileWriter.flush();
+  }
+
+  @Override
+  public void beforeClose() throws IOException {
+    // no-op
+  }
+
+  @Override
+  public boolean supportsReopen() {
+    return false;
+  }
+
+  public static class Builder implements EventSerializer.Builder {
+
+    @Override
+    public EventSerializer build(Context context, OutputStream out) {
+      AvroEventSerializer writer = new AvroEventSerializer(out);
+      writer.configure(context);
+      return writer;
+    }
+
+  }
+
+}
diff --git a/code/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketClosedException.java b/code/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketClosedException.java
new file mode 100644
index 0000000..1d8a9e4
--- /dev/null
+++ b/code/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketClosedException.java
@@ -0,0 +1,30 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.flume.sink.hdfs;
+
+import org.apache.flume.FlumeException;
+
+public class BucketClosedException extends FlumeException {
+
+  private static final long serialVersionUID = -4216667125119540357L;
+
+  public BucketClosedException(String msg) {
+    super(msg);
+  }
+}
diff --git a/code/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java b/code/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java
new file mode 100644
index 0000000..b096410
--- /dev/null
+++ b/code/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java
@@ -0,0 +1,717 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flume.sink.hdfs;
+
+import com.google.common.annotations.VisibleForTesting;
+import com.google.common.base.Throwables;
+import org.apache.flume.Clock;
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.flume.SystemClock;
+import org.apache.flume.auth.PrivilegedExecutor;
+import org.apache.flume.instrumentation.SinkCounter;
+import org.apache.flume.sink.hdfs.HDFSEventSink.WriterCallback;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.io.SequenceFile.CompressionType;
+import org.apache.hadoop.io.compress.CompressionCodec;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.lang.reflect.Method;
+import java.security.PrivilegedExceptionAction;
+import java.util.concurrent.Callable;
+import java.util.concurrent.CancellationException;
+import java.util.concurrent.ExecutionException;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Future;
+import java.util.concurrent.ScheduledExecutorService;
+import java.util.concurrent.ScheduledFuture;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.TimeoutException;
+import java.util.concurrent.atomic.AtomicInteger;
+import java.util.concurrent.atomic.AtomicLong;
+
+/**
+ * Internal API intended for HDFSSink use.
+ * This class does file rolling and handles file formats and serialization.
+ * Only the public methods in this class are thread safe.
+ */
+class BucketWriter {
+
+  private static final Logger LOG = LoggerFactory
+      .getLogger(BucketWriter.class);
+
+  /**
+   * This lock ensures that only one thread can open a file at a time.
+   */
+  private static final Object staticLock = new Object();
+  private Method isClosedMethod = null;
+
+  private HDFSWriter writer;
+  private final long rollInterval;
+  private final long rollSize;
+  private final long rollCount;
+  private final long batchSize;
+  private final CompressionCodec codeC;
+  private final CompressionType compType;
+  private final ScheduledExecutorService timedRollerPool;
+  private final PrivilegedExecutor proxyUser;
+
+  private final AtomicLong fileExtensionCounter;
+
+  private long eventCounter;
+  private long processSize;
+
+  private FileSystem fileSystem;
+
+  private volatile String filePath;
+  private volatile String fileName;
+  private volatile String inUsePrefix;
+  private volatile String inUseSuffix;
+  private volatile String fileSuffix;
+  private volatile String bucketPath;
+  private volatile String targetPath;
+  private volatile long batchCounter;
+  private volatile boolean isOpen;
+  private volatile boolean isUnderReplicated;
+  private volatile int consecutiveUnderReplRotateCount = 0;
+  private volatile ScheduledFuture<Void> timedRollFuture;
+  private SinkCounter sinkCounter;
+  private final int idleTimeout;
+  private volatile ScheduledFuture<Void> idleFuture;
+  private final WriterCallback onCloseCallback;
+  private final String onCloseCallbackPath;
+  private final long callTimeout;
+  private final ExecutorService callTimeoutPool;
+  private final int maxConsecUnderReplRotations = 30; // make this config'able?
+
+  private boolean mockFsInjected = false;
+
+  private Clock clock = new SystemClock();
+  private final long retryInterval;
+  private final int maxRenameTries;
+
+  // Flag that the bucket writer was closed via close(true) (timed roll, idle timeout, or
+  // write failure) and thus shouldn't be reopened. Not ideal, but avoids internals of owners
+  protected boolean closed = false;
+  AtomicInteger renameTries = new AtomicInteger(0);
+
+  BucketWriter(long rollInterval, long rollSize, long rollCount, long batchSize,
+      Context context, String filePath, String fileName, String inUsePrefix,
+      String inUseSuffix, String fileSuffix, CompressionCodec codeC,
+      CompressionType compType, HDFSWriter writer,
+      ScheduledExecutorService timedRollerPool, PrivilegedExecutor proxyUser,
+      SinkCounter sinkCounter, int idleTimeout, WriterCallback onCloseCallback,
+      String onCloseCallbackPath, long callTimeout,
+      ExecutorService callTimeoutPool, long retryInterval,
+      int maxCloseTries) {
+    this.rollInterval = rollInterval;
+    this.rollSize = rollSize;
+    this.rollCount = rollCount;
+    this.batchSize = batchSize;
+    this.filePath = filePath;
+    this.fileName = fileName;
+    this.inUsePrefix = inUsePrefix;
+    this.inUseSuffix = inUseSuffix;
+    this.fileSuffix = fileSuffix;
+    this.codeC = codeC;
+    this.compType = compType;
+    this.writer = writer;
+    this.timedRollerPool = timedRollerPool;
+    this.proxyUser = proxyUser;
+    this.sinkCounter = sinkCounter;
+    this.idleTimeout = idleTimeout;
+    this.onCloseCallback = onCloseCallback;
+    this.onCloseCallbackPath = onCloseCallbackPath;
+    this.callTimeout = callTimeout;
+    this.callTimeoutPool = callTimeoutPool;
+    fileExtensionCounter = new AtomicLong(clock.currentTimeMillis());
+
+    this.retryInterval = retryInterval;
+    this.maxRenameTries = maxCloseTries;
+    isOpen = false;
+    isUnderReplicated = false;
+    this.writer.configure(context);
+  }
+
+  @VisibleForTesting
+  void setFileSystem(FileSystem fs) {
+    this.fileSystem = fs;
+    mockFsInjected = true;
+  }
+
+  @VisibleForTesting
+  void setMockStream(HDFSWriter dataWriter) {
+    this.writer = dataWriter;
+  }
+
+
+  /**
+   * Clear the class counters
+   */
+  private void resetCounters() {
+    eventCounter = 0;
+    processSize = 0;
+    batchCounter = 0;
+  }
+
+  private Method getRefIsClosed() {
+    try {
+      return fileSystem.getClass().getMethod("isFileClosed",
+        Path.class);
+    } catch (Exception e) {
+      LOG.warn("isFileClosed is not available in the " +
+          "version of HDFS being used. Flume will not " +
+          "attempt to re-close files if the close fails on " +
+          "the first attempt", e);
+      return null;
+    }
+  }
+
+  private Boolean isFileClosed(FileSystem fs, Path tmpFilePath) throws Exception {
+    return (Boolean)(isClosedMethod.invoke(fs, tmpFilePath));
+  }
+
+  /**
+   * open() is called by append()
+   * @throws IOException
+   * @throws InterruptedException
+   */
+  private void open() throws IOException, InterruptedException {
+    if ((filePath == null) || (writer == null)) {
+      throw new IOException("Invalid file settings");
+    }
+
+    final Configuration config = new Configuration();
+    // disable FileSystem JVM shutdown hook
+    config.setBoolean("fs.automatic.close", false);
+
+    // Hadoop is not thread safe when doing certain RPC operations,
+    // including getFileSystem(), when running under Kerberos.
+    // open() must be called by one thread at a time in the JVM.
+    // NOTE: tried synchronizing on the underlying Kerberos principal previously
+    // which caused deadlocks. See FLUME-1231.
+    synchronized (staticLock) {
+      checkAndThrowInterruptedException();
+
+      try {
+        long counter = fileExtensionCounter.incrementAndGet();
+
+        String fullFileName = fileName + "." + counter;
+
+        if (fileSuffix != null && fileSuffix.length() > 0) {
+          fullFileName += fileSuffix;
+        } else if (codeC != null) {
+          fullFileName += codeC.getDefaultExtension();
+        }
+
+        bucketPath = filePath + "/" + inUsePrefix
+          + fullFileName + inUseSuffix;
+        targetPath = filePath + "/" + fullFileName;
+
+        LOG.info("Creating " + bucketPath);
+        callWithTimeout(new CallRunner<Void>() {
+          @Override
+          public Void call() throws Exception {
+            if (codeC == null) {
+              // Need to get reference to FS using above config before underlying
+              // writer does in order to avoid shutdown hook &
+              // IllegalStateExceptions
+              if (!mockFsInjected) {
+                fileSystem = new Path(bucketPath).getFileSystem(config);
+              }
+              writer.open(bucketPath);
+            } else {
+              // need to get reference to FS before writer does to
+              // avoid shutdown hook
+              if (!mockFsInjected) {
+                fileSystem = new Path(bucketPath).getFileSystem(config);
+              }
+              writer.open(bucketPath, codeC, compType);
+            }
+            return null;
+          }
+        });
+      } catch (Exception ex) {
+        sinkCounter.incrementConnectionFailedCount();
+        if (ex instanceof IOException) {
+          throw (IOException) ex;
+        } else {
+          throw Throwables.propagate(ex);
+        }
+      }
+    }
+    isClosedMethod = getRefIsClosed();
+    sinkCounter.incrementConnectionCreatedCount();
+    resetCounters();
+
+    // if time-based rolling is enabled, schedule the roll
+    if (rollInterval > 0) {
+      Callable<Void> action = new Callable<Void>() {
+        public Void call() throws Exception {
+          LOG.debug("Rolling file ({}): Roll scheduled after {} sec elapsed.",
+              bucketPath, rollInterval);
+          try {
+            // Roll the file and remove reference from sfWriters map.
+            close(true);
+          } catch (Throwable t) {
+            LOG.error("Unexpected error", t);
+          }
+          return null;
+        }
+      };
+      timedRollFuture = timedRollerPool.schedule(action, rollInterval,
+          TimeUnit.SECONDS);
+    }
+
+    isOpen = true;
+  }
+
+  /**
+   * Close the file handle and rename the temp file to the permanent filename.
+   * Safe to call multiple times. Logs HDFSWriter.close() exceptions. This
+   * method will not cause the bucket writer to be dereferenced from the HDFS
+   * sink that owns it. This method should be used only when size or count
+   * based rolling closes this file.
+   * @throws IOException On failure to rename if temp file exists.
+   * @throws InterruptedException
+   */
+  public synchronized void close() throws IOException, InterruptedException {
+    close(false);
+  }
+
+  private CallRunner<Void> createCloseCallRunner() {
+    return new CallRunner<Void>() {
+      private final HDFSWriter localWriter = writer;
+      @Override
+      public Void call() throws Exception {
+        localWriter.close(); // could block
+        return null;
+      }
+    };
+  }
+
+  private Callable<Void> createScheduledRenameCallable() {
+
+    return new Callable<Void>() {
+      private final String path = bucketPath;
+      private final String finalPath = targetPath;
+      private FileSystem fs = fileSystem;
+      private int renameTries = 1; // one attempt is already done
+
+      @Override
+      public Void call() throws Exception {
+        if (renameTries >= maxRenameTries) {
+          LOG.warn("Unsuccessfully attempted to rename " + path + " " +
+              maxRenameTries + " times. File may still be open.");
+          return null;
+        }
+        renameTries++;
+        try {
+          renameBucket(path, finalPath, fs);
+        } catch (Exception e) {
+          LOG.warn("Renaming file: " + path + " failed. Will " +
+              "retry in " + retryInterval + " seconds.", e);
+          timedRollerPool.schedule(this, retryInterval, TimeUnit.SECONDS);
+          return null;
+        }
+        return null;
+      }
+    };
+  }
+
+  /**
+   * Close the file handle and rename the temp file to the permanent filename.
+   * Safe to call multiple times. Logs HDFSWriter.close() exceptions.
+   * @throws IOException On failure to rename if temp file exists.
+   * @throws InterruptedException
+   */
+  public synchronized void close(boolean callCloseCallback)
+      throws IOException, InterruptedException {
+    checkAndThrowInterruptedException();
+    try {
+      flush();
+    } catch (IOException e) {
+      LOG.warn("pre-close flush failed", e);
+    }
+    boolean failedToClose = false;
+    LOG.info("Closing {}", bucketPath);
+    CallRunner<Void> closeCallRunner = createCloseCallRunner();
+    if (isOpen) {
+      try {
+        callWithTimeout(closeCallRunner);
+        sinkCounter.incrementConnectionClosedCount();
+      } catch (IOException e) {
+        LOG.warn("failed to close() HDFSWriter for file (" + bucketPath +
+                 "). Exception follows.", e);
+        sinkCounter.incrementConnectionFailedCount();
+        failedToClose = true;
+      }
+      isOpen = false;
+    } else {
+      LOG.info("HDFSWriter is already closed: {}", bucketPath);
+    }
+
+    // NOTE: timed rolls go through this codepath as well as other roll types
+    if (timedRollFuture != null && !timedRollFuture.isDone()) {
+      timedRollFuture.cancel(false); // do not cancel myself if running!
+      timedRollFuture = null;
+    }
+
+    if (idleFuture != null && !idleFuture.isDone()) {
+      idleFuture.cancel(false); // do not cancel myself if running!
+      idleFuture = null;
+    }
+
+    if (bucketPath != null && fileSystem != null) {
+      // could block or throw IOException
+      try {
+        renameBucket(bucketPath, targetPath, fileSystem);
+      } catch (Exception e) {
+        LOG.warn("failed to rename() file (" + bucketPath +
+                 "). Exception follows.", e);
+        sinkCounter.incrementConnectionFailedCount();
+        final Callable<Void> scheduledRename = createScheduledRenameCallable();
+        timedRollerPool.schedule(scheduledRename, retryInterval, TimeUnit.SECONDS);
+      }
+    }
+    if (callCloseCallback) {
+      runCloseAction();
+      closed = true;
+    }
+  }
+
+  /**
+   * flush the data
+   * @throws IOException
+   * @throws InterruptedException
+   */
+  public synchronized void flush() throws IOException, InterruptedException {
+    checkAndThrowInterruptedException();
+    if (!isBatchComplete()) {
+      doFlush();
+
+      if (idleTimeout > 0) {
+        // if the future exists and couldn't be cancelled, that would mean it has already run
+        // or been cancelled
+        if (idleFuture == null || idleFuture.cancel(false)) {
+          Callable<Void> idleAction = new Callable<Void>() {
+            public Void call() throws Exception {
+              LOG.info("Closing idle bucketWriter {} at {}", bucketPath,
+                       System.currentTimeMillis());
+              if (isOpen) {
+                close(true);
+              }
+              return null;
+            }
+          };
+          idleFuture = timedRollerPool.schedule(idleAction, idleTimeout,
+              TimeUnit.SECONDS);
+        }
+      }
+    }
+  }
+
+  private void runCloseAction() {
+    try {
+      if (onCloseCallback != null) {
+        onCloseCallback.run(onCloseCallbackPath);
+      }
+    } catch (Throwable t) {
+      LOG.error("Unexpected error", t);
+    }
+  }
+
+  /**
+   * doFlush() must only be called by flush()
+   * @throws IOException
+   */
+  private void doFlush() throws IOException, InterruptedException {
+    callWithTimeout(new CallRunner<Void>() {
+      @Override
+      public Void call() throws Exception {
+        writer.sync(); // could block
+        return null;
+      }
+    });
+    batchCounter = 0;
+  }
+
+  /**
+   * Open file handles, write data, update stats, handle file rolling and
+   * batching / flushing. <br />
+   * If the write fails, the file is implicitly closed and then the IOException
+   * is rethrown. <br />
+   * We rotate before append, and not after, so that the active file rolling
+   * mechanism will never roll an empty file. This also ensures that the file
+   * creation time reflects when the first event was written.
+   *
+   * @throws IOException
+   * @throws InterruptedException
+   */
+  public synchronized void append(final Event event)
+          throws IOException, InterruptedException {
+    checkAndThrowInterruptedException();
+    // If idleFuture is not null, cancel it before we move forward to avoid a
+    // close call in the middle of the append.
+    if (idleFuture != null) {
+      idleFuture.cancel(false);
+      // There is still a small race condition - if the idleFuture is already
+      // running, interrupting it can cause HDFS close operation to throw -
+      // so we cannot interrupt it while running. If the future could not be
+      // cancelled, it is already running - wait for it to finish before
+      // attempting to write.
+      if (!idleFuture.isDone()) {
+        try {
+          idleFuture.get(callTimeout, TimeUnit.MILLISECONDS);
+        } catch (TimeoutException ex) {
+          LOG.warn("Timeout while trying to cancel closing of idle file. Idle" +
+                   " file close may have failed", ex);
+        } catch (Exception ex) {
+          LOG.warn("Error while trying to cancel closing of idle file. ", ex);
+        }
+      }
+      idleFuture = null;
+    }
+
+    // If the bucket writer was closed due to roll timeout or idle timeout,
+    // force a new bucket writer to be created. Roll count and roll size will
+    // just reuse this one
+    if (!isOpen) {
+      if (closed) {
+        throw new BucketClosedException("This bucket writer was closed and " +
+          "this handle is thus no longer valid");
+      }
+      open();
+    }
+
+    // check if it's time to rotate the file
+    if (shouldRotate()) {
+      boolean doRotate = true;
+
+      if (isUnderReplicated) {
+        if (maxConsecUnderReplRotations > 0 &&
+            consecutiveUnderReplRotateCount >= maxConsecUnderReplRotations) {
+          doRotate = false;
+          if (consecutiveUnderReplRotateCount == maxConsecUnderReplRotations) {
+            LOG.error("Hit max consecutive under-replication rotations ({}); " +
+                "will not continue rolling files under this path due to " +
+                "under-replication", maxConsecUnderReplRotations);
+          }
+        } else {
+          LOG.warn("Block Under-replication detected. Rotating file.");
+        }
+        consecutiveUnderReplRotateCount++;
+      } else {
+        consecutiveUnderReplRotateCount = 0;
+      }
+
+      if (doRotate) {
+        close();
+        open();
+      }
+    }
+
+    // write the event
+    try {
+      sinkCounter.incrementEventDrainAttemptCount();
+      callWithTimeout(new CallRunner<Void>() {
+        @Override
+        public Void call() throws Exception {
+          writer.append(event); // could block
+          return null;
+        }
+      });
+    } catch (IOException e) {
+      LOG.warn("Caught IOException writing to HDFSWriter ({}). Closing file (" +
+          bucketPath + ") and rethrowing exception.",
+          e.getMessage());
+      try {
+        close(true);
+      } catch (IOException e2) {
+        LOG.warn("Caught IOException while closing file (" +
+             bucketPath + "). Exception follows.", e2);
+      }
+      throw e;
+    }
+
+    // update statistics
+    processSize += event.getBody().length;
+    eventCounter++;
+    batchCounter++;
+
+    if (batchCounter == batchSize) {
+      flush();
+    }
+  }
+
+  /**
+   * Check whether it is time to rotate the file.
+   */
+  private boolean shouldRotate() {
+    boolean doRotate = false;
+
+    if (writer.isUnderReplicated()) {
+      this.isUnderReplicated = true;
+      doRotate = true;
+    } else {
+      this.isUnderReplicated = false;
+    }
+
+    if ((rollCount > 0) && (rollCount <= eventCounter)) {
+      LOG.debug("rolling: rollCount: {}, events: {}", rollCount, eventCounter);
+      doRotate = true;
+    }
+
+    if ((rollSize > 0) && (rollSize <= processSize)) {
+      LOG.debug("rolling: rollSize: {}, bytes: {}", rollSize, processSize);
+      doRotate = true;
+    }
+
+    return doRotate;
+  }
+
+  /**
+   * Rename bucketPath file from .tmp to permanent location.
+   */
+  // When this bucket writer is rolled based on rollCount or
+  // rollSize, the same instance is reused for the new file. But if
+  // the previous file was not closed/renamed,
+  // the bucket writer fields no longer point to it and hence need
+  // to be passed in from the thread attempting to close it. Even
+  // when the bucket writer is closed due to close timeout,
+  // this method can get called from the scheduled thread so the
+  // file gets closed later - so an implicit reference to this
+  // bucket writer would still be alive in the Callable instance.
+  private void renameBucket(String bucketPath, String targetPath, final FileSystem fs)
+      throws IOException, InterruptedException {
+    if (bucketPath.equals(targetPath)) {
+      return;
+    }
+
+    final Path srcPath = new Path(bucketPath);
+    final Path dstPath = new Path(targetPath);
+
+    callWithTimeout(new CallRunner<Void>() {
+      @Override
+      public Void call() throws Exception {
+        if (fs.exists(srcPath)) { // could block
+          LOG.info("Renaming " + srcPath + " to " + dstPath);
+          renameTries.incrementAndGet();
+          fs.rename(srcPath, dstPath); // could block
+        }
+        return null;
+      }
+    });
+  }
+
+  @Override
+  public String toString() {
+    return "[ " + this.getClass().getSimpleName() + " targetPath = " + targetPath +
+        ", bucketPath = " + bucketPath + " ]";
+  }
+
+  private boolean isBatchComplete() {
+    return (batchCounter == 0);
+  }
+
+  void setClock(Clock clock) {
+    this.clock = clock;
+  }
+
+  /**
+   * Checks whether the current thread has been interrupted (clearing the
+   * interrupt flag) and, if so, throws an exception.
+   * @throws InterruptedException
+   */
+  private static void checkAndThrowInterruptedException()
+          throws InterruptedException {
+    if (Thread.interrupted()) {
+      throw new InterruptedException("Timed out before HDFS call was made. "
+              + "Your hdfs.callTimeout might be set too low or HDFS calls are "
+              + "taking too long.");
+    }
+  }
+
+  /**
+   * Execute the callable on a separate thread and wait for its completion
+   * for the specified amount of time in milliseconds. In case of a timeout,
+   * cancel the callable and throw an IOException.
+   */
+  private <T> T callWithTimeout(final CallRunner<T> callRunner)
+      throws IOException, InterruptedException {
+    Future<T> future = callTimeoutPool.submit(new Callable<T>() {
+      @Override
+      public T call() throws Exception {
+        return proxyUser.execute(new PrivilegedExceptionAction<T>() {
+          @Override
+          public T run() throws Exception {
+            return callRunner.call();
+          }
+        });
+      }
+    });
+    try {
+      if (callTimeout > 0) {
+        return future.get(callTimeout, TimeUnit.MILLISECONDS);
+      } else {
+        return future.get();
+      }
+    } catch (TimeoutException eT) {
+      future.cancel(true);
+      sinkCounter.incrementConnectionFailedCount();
+      throw new IOException("Callable timed out after " +
+        callTimeout + " ms" + " on file: " + bucketPath, eT);
+    } catch (ExecutionException e1) {
+      sinkCounter.incrementConnectionFailedCount();
+      Throwable cause = e1.getCause();
+      if (cause instanceof IOException) {
+        throw (IOException) cause;
+      } else if (cause instanceof InterruptedException) {
+        throw (InterruptedException) cause;
+      } else if (cause instanceof RuntimeException) {
+        throw (RuntimeException) cause;
+      } else if (cause instanceof Error) {
+        throw (Error)cause;
+      } else {
+        throw new RuntimeException(e1);
+      }
+    } catch (CancellationException ce) {
+      throw new InterruptedException(
+        "Blocked callable interrupted by rotation event");
+    } catch (InterruptedException ex) {
+      LOG.warn("Unexpected Exception " + ex.getMessage(), ex);
+      throw ex;
+    }
+  }
+
+  /**
+   * Simple interface whose {@code call} method is called by
+   * {@link #callWithTimeout} in a new thread inside a
+   * {@linkplain java.security.PrivilegedExceptionAction#run()} call.
+   * @param <T>
+   */
+  private interface CallRunner<T> {
+    T call() throws Exception;
+  }
+
+}
diff --git a/code/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSCompressedDataStream.java b/code/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSCompressedDataStream.java
new file mode 100644
index 0000000..80b7cb4
--- /dev/null
+++ b/code/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSCompressedDataStream.java
@@ -0,0 +1,162 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flume.sink.hdfs;
+
+import java.io.IOException;
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.flume.serialization.EventSerializer;
+import org.apache.flume.serialization.EventSerializerFactory;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FSDataOutputStream;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.LocalFileSystem;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.io.SequenceFile.CompressionType;
+import org.apache.hadoop.io.compress.CodecPool;
+import org.apache.hadoop.io.compress.CompressionCodec;
+import org.apache.hadoop.io.compress.CompressionOutputStream;
+import org.apache.hadoop.io.compress.Compressor;
+import org.apache.hadoop.io.compress.DefaultCodec;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public class HDFSCompressedDataStream extends AbstractHDFSWriter {
+
+  private static final Logger logger =
+      LoggerFactory.getLogger(HDFSCompressedDataStream.class);
+
+  private FSDataOutputStream fsOut;
+  private CompressionOutputStream cmpOut;
+  private boolean isFinished = false;
+
+  private String serializerType;
+  private Context serializerContext;
+  private EventSerializer serializer;
+  private boolean useRawLocalFileSystem;
+  private Compressor compressor;
+
+  @Override
+  public void configure(Context context) {
+    super.configure(context);
+
+    serializerType = context.getString("serializer", "TEXT");
+    useRawLocalFileSystem = context.getBoolean("hdfs.useRawLocalFileSystem",
+        false);
+    serializerContext = new Context(
+        context.getSubProperties(EventSerializer.CTX_PREFIX));
+    logger.info("Serializer = " + serializerType + ", UseRawLocalFileSystem = "
+        + useRawLocalFileSystem);
+  }
+
+  @Override
+  public void open(String filePath) throws IOException {
+    DefaultCodec defCodec = new DefaultCodec();
+    CompressionType cType = CompressionType.BLOCK;
+    open(filePath, defCodec, cType);
+  }
+
+  @Override
+  public void open(String filePath, CompressionCodec codec,
+      CompressionType cType) throws IOException {
+    Configuration conf = new Configuration();
+    Path dstPath = new Path(filePath);
+    FileSystem hdfs = dstPath.getFileSystem(conf);
+    if (useRawLocalFileSystem) {
+      if (hdfs instanceof LocalFileSystem) {
+        hdfs = ((LocalFileSystem)hdfs).getRaw();
+      } else {
+        logger.warn("useRawLocalFileSystem is set to true but file system " +
+            "is not of type LocalFileSystem: " + hdfs.getClass().getName());
+      }
+    }
+    boolean appending = false;
+    if (conf.getBoolean("hdfs.append.support", false) && hdfs.isFile(dstPath)) {
+      fsOut = hdfs.append(dstPath);
+      appending = true;
+    } else {
+      fsOut = hdfs.create(dstPath);
+    }
+    if (compressor == null) {
+      compressor = CodecPool.getCompressor(codec, conf);
+    }
+    cmpOut = codec.createOutputStream(fsOut, compressor);
+    serializer = EventSerializerFactory.getInstance(serializerType,
+        serializerContext, cmpOut);
+    if (appending && !serializer.supportsReopen()) {
+      cmpOut.close();
+      serializer = null;
+      throw new IOException("serializer (" + serializerType
+          + ") does not support append");
+    }
+
+    registerCurrentStream(fsOut, hdfs, dstPath);
+
+    if (appending) {
+      serializer.afterReopen();
+    } else {
+      serializer.afterCreate();
+    }
+    isFinished = false;
+  }
+
+  @Override
+  public void append(Event e) throws IOException {
+    if (isFinished) {
+      cmpOut.resetState();
+      isFinished = false;
+    }
+    serializer.write(e);
+  }
+
+  @Override
+  public void sync() throws IOException {
+    // We must use finish() and resetState() here -- flush() is apparently not
+    // supported by the compressed output streams (it's a no-op).
+    // Also, since resetState() writes headers, avoid calling it without an
+    // additional write/append operation.
+    // Note: There are bugs in Hadoop & JDK w/ pure-java gzip; see HADOOP-8522.
+    serializer.flush();
+    if (!isFinished) {
+      cmpOut.finish();
+      isFinished = true;
+    }
+    fsOut.flush();
+    hflushOrSync(this.fsOut);
+  }
+
+  @Override
+  public void close() throws IOException {
+    serializer.flush();
+    serializer.beforeClose();
+    if (!isFinished) {
+      cmpOut.finish();
+      isFinished = true;
+    }
+    fsOut.flush();
+    hflushOrSync(fsOut);
+    cmpOut.close();
+    if (compressor != null) {
+      CodecPool.returnCompressor(compressor);
+      compressor = null;
+    }
+    unregisterCurrentStream();
+  }
+
+}
diff --git a/code/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSDataStream.java b/code/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSDataStream.java
new file mode 100644
index 0000000..c4ad919
--- /dev/null
+++ b/code/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSDataStream.java
@@ -0,0 +1,140 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flume.sink.hdfs;
+
+import java.io.IOException;
+
+import com.google.common.annotations.VisibleForTesting;
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.flume.serialization.EventSerializer;
+import org.apache.flume.serialization.EventSerializerFactory;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FSDataOutputStream;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.LocalFileSystem;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.io.SequenceFile.CompressionType;
+import org.apache.hadoop.io.compress.CompressionCodec;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public class HDFSDataStream extends AbstractHDFSWriter {
+
+  private static final Logger logger = LoggerFactory.getLogger(HDFSDataStream.class);
+
+  private FSDataOutputStream outStream;
+  private String serializerType;
+  private Context serializerContext;
+  private EventSerializer serializer;
+  private boolean useRawLocalFileSystem;
+
+  @Override
+  public void configure(Context context) {
+    super.configure(context);
+
+    serializerType = context.getString("serializer", "TEXT");
+    useRawLocalFileSystem = context.getBoolean("hdfs.useRawLocalFileSystem",
+        false);
+    serializerContext =
+        new Context(context.getSubProperties(EventSerializer.CTX_PREFIX));
+    logger.info("Serializer = " + serializerType + ", UseRawLocalFileSystem = "
+        + useRawLocalFileSystem);
+  }
+
+  @VisibleForTesting
+  protected FileSystem getDfs(Configuration conf, Path dstPath) throws IOException {
+    return dstPath.getFileSystem(conf);
+  }
+
+  protected void doOpen(Configuration conf, Path dstPath, FileSystem hdfs) throws IOException {
+    if (useRawLocalFileSystem) {
+      if (hdfs instanceof LocalFileSystem) {
+        hdfs = ((LocalFileSystem)hdfs).getRaw();
+      } else {
+        logger.warn("useRawLocalFileSystem is set to true but file system " +
+            "is not of type LocalFileSystem: " + hdfs.getClass().getName());
+      }
+    }
+
+    boolean appending = false;
+    if (conf.getBoolean("hdfs.append.support", false) && hdfs.isFile(dstPath)) {
+      outStream = hdfs.append(dstPath);
+      appending = true;
+    } else {
+      outStream = hdfs.create(dstPath);
+    }
+
+    serializer = EventSerializerFactory.getInstance(
+        serializerType, serializerContext, outStream);
+    if (appending && !serializer.supportsReopen()) {
+      outStream.close();
+      serializer = null;
+      throw new IOException("serializer (" + serializerType +
+          ") does not support append");
+    }
+
+    // must call superclass to check for replication issues
+    registerCurrentStream(outStream, hdfs, dstPath);
+
+    if (appending) {
+      serializer.afterReopen();
+    } else {
+      serializer.afterCreate();
+    }
+  }
+
+  @Override
+  public void open(String filePath) throws IOException {
+    Configuration conf = new Configuration();
+    Path dstPath = new Path(filePath);
+    FileSystem hdfs = getDfs(conf, dstPath);
+    doOpen(conf, dstPath, hdfs);
+  }
+
+  @Override
+  public void open(String filePath, CompressionCodec codec,
+                   CompressionType cType) throws IOException {
+    open(filePath);
+  }
+
+  @Override
+  public void append(Event e) throws IOException {
+    serializer.write(e);
+  }
+
+  @Override
+  public void sync() throws IOException {
+    serializer.flush();
+    outStream.flush();
+    hflushOrSync(outStream);
+  }
+
+  @Override
+  public void close() throws IOException {
+    serializer.flush();
+    serializer.beforeClose();
+    outStream.flush();
+    hflushOrSync(outStream);
+    outStream.close();
+
+    unregisterCurrentStream();
+  }
+
+}
diff --git a/code/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java b/code/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java
new file mode 100644
index 0000000..741f01e
--- /dev/null
+++ b/code/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java
@@ -0,0 +1,559 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flume.sink.hdfs;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Calendar;
+import java.util.LinkedHashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.TimeZone;
+import java.util.Map.Entry;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+import java.util.concurrent.ScheduledExecutorService;
+import java.util.concurrent.TimeUnit;
+
+import com.google.common.annotations.VisibleForTesting;
+import org.apache.flume.Channel;
+import org.apache.flume.Clock;
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.flume.EventDeliveryException;
+import org.apache.flume.SystemClock;
+import org.apache.flume.Transaction;
+import org.apache.flume.auth.FlumeAuthenticationUtil;
+import org.apache.flume.auth.FlumeAuthenticator;
+import org.apache.flume.auth.PrivilegedExecutor;
+import org.apache.flume.conf.Configurable;
+import org.apache.flume.formatter.output.BucketPath;
+import org.apache.flume.instrumentation.SinkCounter;
+import org.apache.flume.sink.AbstractSink;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.io.SequenceFile.CompressionType;
+import org.apache.hadoop.io.compress.CompressionCodec;
+import org.apache.hadoop.io.compress.CompressionCodecFactory;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import com.google.common.base.Preconditions;
+import com.google.common.collect.Lists;
+import com.google.common.util.concurrent.ThreadFactoryBuilder;
+
+public class HDFSEventSink extends AbstractSink implements Configurable {
+  public interface WriterCallback {
+    public void run(String filePath);
+  }
+
+  private static final Logger LOG = LoggerFactory.getLogger(HDFSEventSink.class);
+
+  private static String DIRECTORY_DELIMITER = System.getProperty("file.separator");
+
+  private static final long defaultRollInterval = 30;
+  private static final long defaultRollSize = 1024;
+  private static final long defaultRollCount = 10;
+  private static final String defaultFileName = "FlumeData";
+  private static final String defaultSuffix = "";
+  private static final String defaultInUsePrefix = "";
+  private static final String defaultInUseSuffix = ".tmp";
+  private static final long defaultBatchSize = 100;
+  private static final String defaultFileType = HDFSWriterFactory.SequenceFileType;
+  private static final int defaultMaxOpenFiles = 5000;
+  // Time between close retries, in seconds
+  private static final long defaultRetryInterval = 180;
+  // Retry forever.
+  private static final int defaultTryCount = Integer.MAX_VALUE;
+
+  /**
+   * Default length of time we wait for blocking BucketWriter calls
+   * before timing out the operation. Intended to prevent server hangs.
+   */
+  private static final long defaultCallTimeout = 10000;
+  /**
+   * Default number of threads available for tasks
+   * such as append/open/close/flush with hdfs.
+   * These tasks are done in a separate thread so
+   * that, if they take too long, we can time out,
+   * create a new file and move on.
+   */
+  private static final int defaultThreadPoolSize = 10;
+  private static final int defaultRollTimerPoolSize = 1;
+
+  private final HDFSWriterFactory writerFactory;
+  private WriterLinkedHashMap sfWriters;
+
+  private long rollInterval;
+  private long rollSize;
+  private long rollCount;
+  private long batchSize;
+  private int threadsPoolSize;
+  private int rollTimerPoolSize;
+  private CompressionCodec codeC;
+  private CompressionType compType;
+  private String fileType;
+  private String filePath;
+  private String fileName;
+  private String suffix;
+  private String inUsePrefix;
+  private String inUseSuffix;
+  private TimeZone timeZone;
+  private int maxOpenFiles;
+  private ExecutorService callTimeoutPool;
+  private ScheduledExecutorService timedRollerPool;
+
+  private boolean needRounding = false;
+  private int roundUnit = Calendar.SECOND;
+  private int roundValue = 1;
+  private boolean useLocalTime = false;
+
+  private long callTimeout;
+  private Context context;
+  private SinkCounter sinkCounter;
+
+  private volatile int idleTimeout;
+  private Clock clock;
+  private FileSystem mockFs;
+  private HDFSWriter mockWriter;
+  private final Object sfWritersLock = new Object();
+  private long retryInterval;
+  private int tryCount;
+  private PrivilegedExecutor privExecutor;
+
+
+  /*
+   * Extended Java LinkedHashMap for open file handle LRU queue.
+   * We want to clear the oldest file handle if there are too many open ones.
+   */
+  private static class WriterLinkedHashMap
+      extends LinkedHashMap<String, BucketWriter> {
+
+    private final int maxOpenFiles;
+
+    public WriterLinkedHashMap(int maxOpenFiles) {
+      super(16, 0.75f, true); // stock initial capacity/load, access ordering
+      this.maxOpenFiles = maxOpenFiles;
+    }
+
+    @Override
+    protected boolean removeEldestEntry(Entry<String, BucketWriter> eldest) {
+      if (size() > maxOpenFiles) {
+        // If we have more than the max open files, close the eldest one and
+        // return true
+        try {
+          eldest.getValue().close();
+        } catch (IOException e) {
+          LOG.warn(eldest.getKey().toString(), e);
+        } catch (InterruptedException e) {
+          LOG.warn(eldest.getKey().toString(), e);
+          Thread.currentThread().interrupt();
+        }
+        return true;
+      } else {
+        return false;
+      }
+    }
+  }
+
+  public HDFSEventSink() {
+    this(new HDFSWriterFactory());
+  }
+
+  public HDFSEventSink(HDFSWriterFactory writerFactory) {
+    this.writerFactory = writerFactory;
+  }
+
+  @VisibleForTesting
+  Map<String, BucketWriter> getSfWriters() {
+    return sfWriters;
+  }
+
+  // read configuration and setup thresholds
+  @Override
+  public void configure(Context context) {
+    this.context = context;
+
+    filePath = Preconditions.checkNotNull(
+        context.getString("hdfs.path"), "hdfs.path is required");
+    fileName = context.getString("hdfs.filePrefix", defaultFileName);
+    this.suffix = context.getString("hdfs.fileSuffix", defaultSuffix);
+    inUsePrefix = context.getString("hdfs.inUsePrefix", defaultInUsePrefix);
+    inUseSuffix = context.getString("hdfs.inUseSuffix", defaultInUseSuffix);
+    String tzName = context.getString("hdfs.timeZone");
+    timeZone = tzName == null ? null : TimeZone.getTimeZone(tzName);
+    rollInterval = context.getLong("hdfs.rollInterval", defaultRollInterval);
+    rollSize = context.getLong("hdfs.rollSize", defaultRollSize);
+    rollCount = context.getLong("hdfs.rollCount", defaultRollCount);
+    batchSize = context.getLong("hdfs.batchSize", defaultBatchSize);
+    idleTimeout = context.getInteger("hdfs.idleTimeout", 0);
+    String codecName = context.getString("hdfs.codeC");
+    fileType = context.getString("hdfs.fileType", defaultFileType);
+    maxOpenFiles = context.getInteger("hdfs.maxOpenFiles", defaultMaxOpenFiles);
+    callTimeout = context.getLong("hdfs.callTimeout", defaultCallTimeout);
+    threadsPoolSize = context.getInteger("hdfs.threadsPoolSize",
+        defaultThreadPoolSize);
+    rollTimerPoolSize = context.getInteger("hdfs.rollTimerPoolSize",
+        defaultRollTimerPoolSize);
+    String kerbConfPrincipal = context.getString("hdfs.kerberosPrincipal");
+    String kerbKeytab = context.getString("hdfs.kerberosKeytab");
+    String proxyUser = context.getString("hdfs.proxyUser");
+    tryCount = context.getInteger("hdfs.closeTries", defaultTryCount);
+    if (tryCount <= 0) {
+      LOG.warn("Retry count value : " + tryCount + " is not " +
+          "valid. The sink will try to close the file until the file " +
+          "is eventually closed.");
+      tryCount = defaultTryCount;
+    }
+    retryInterval = context.getLong("hdfs.retryInterval", defaultRetryInterval);
+    if (retryInterval <= 0) {
+      LOG.warn("Retry Interval value: " + retryInterval + " is not " +
+          "valid. If the first close of a file fails, " +
+          "it may remain open and will not be renamed.");
+      tryCount = 1;
+    }
+
+    Preconditions.checkArgument(batchSize > 0, "batchSize must be greater than 0");
+    if (codecName == null) {
+      codeC = null;
+      compType = CompressionType.NONE;
+    } else {
+      codeC = getCodec(codecName);
+      // TODO : set proper compression type
+      compType = CompressionType.BLOCK;
+    }
+
+    // Do not allow the user to set fileType DataStream together with codeC,
+    // to prevent output files with a compression extension (like .snappy)
+    if (fileType.equalsIgnoreCase(HDFSWriterFactory.DataStreamType) && codecName != null) {
+      throw new IllegalArgumentException("fileType: " + fileType +
+          " does NOT support compressed output. Please don't set codeC" +
+          " or change the fileType if compressed output is desired.");
+    }
+
+    if (fileType.equalsIgnoreCase(HDFSWriterFactory.CompStreamType)) {
+      Preconditions.checkNotNull(codeC, "It's essential to set compress codec"
+          + " when fileType is: " + fileType);
+    }
+
+    // get the appropriate executor
+    this.privExecutor = FlumeAuthenticationUtil.getAuthenticator(
+            kerbConfPrincipal, kerbKeytab).proxyAs(proxyUser);
+
+    needRounding = context.getBoolean("hdfs.round", false);
+
+    if (needRounding) {
+      String unit = context.getString("hdfs.roundUnit", "second");
+      if (unit.equalsIgnoreCase("hour")) {
+        this.roundUnit = Calendar.HOUR_OF_DAY;
+      } else if (unit.equalsIgnoreCase("minute")) {
+        this.roundUnit = Calendar.MINUTE;
+      } else if (unit.equalsIgnoreCase("second")) {
+        this.roundUnit = Calendar.SECOND;
+      } else {
+        LOG.warn("Rounding unit is not valid, please set one of " +
+            "minute, hour, or second. Rounding will be disabled");
+        needRounding = false;
+      }
+      this.roundValue = context.getInteger("hdfs.roundValue", 1);
+      if (roundUnit == Calendar.SECOND || roundUnit == Calendar.MINUTE) {
+        Preconditions.checkArgument(roundValue > 0 && roundValue <= 60,
+            "Round value " +
+            "must be > 0 and <= 60");
+      } else if (roundUnit == Calendar.HOUR_OF_DAY) {
+        Preconditions.checkArgument(roundValue > 0 && roundValue <= 24,
+            "Round value " +
+            "must be > 0 and <= 24");
+      }
+    }
+
+    this.useLocalTime = context.getBoolean("hdfs.useLocalTimeStamp", false);
+    if (useLocalTime) {
+      clock = new SystemClock();
+    }
+
+    if (sinkCounter == null) {
+      sinkCounter = new SinkCounter(getName());
+    }
+  }
+
+  private static boolean codecMatches(Class<? extends CompressionCodec> cls, String codecName) {
+    String simpleName = cls.getSimpleName();
+    if (cls.getName().equals(codecName) || simpleName.equalsIgnoreCase(codecName)) {
+      return true;
+    }
+    if (simpleName.endsWith("Codec")) {
+      String prefix = simpleName.substring(0, simpleName.length() - "Codec".length());
+      if (prefix.equalsIgnoreCase(codecName)) {
+        return true;
+      }
+    }
+    return false;
+  }
+
+  @VisibleForTesting
+  static CompressionCodec getCodec(String codecName) {
+    Configuration conf = new Configuration();
+    List<Class<? extends CompressionCodec>> codecs = CompressionCodecFactory.getCodecClasses(conf);
+    // Wish we could base this on DefaultCodec, but it appears that not all
+    // codecs extend DefaultCodec (e.g. Lzo)
+    CompressionCodec codec = null;
+    ArrayList<String> codecStrs = new ArrayList<String>();
+    codecStrs.add("None");
+    for (Class<? extends CompressionCodec> cls : codecs) {
+      codecStrs.add(cls.getSimpleName());
+      if (codecMatches(cls, codecName)) {
+        try {
+          codec = cls.newInstance();
+        } catch (InstantiationException e) {
+          LOG.error("Unable to instantiate " + cls + " class");
+        } catch (IllegalAccessException e) {
+          LOG.error("Unable to access " + cls + " class");
+        }
+      }
+    }
+
+    if (codec == null) {
+      if (!codecName.equalsIgnoreCase("None")) {
+        throw new IllegalArgumentException("Unsupported compression codec "
+            + codecName + ".  Please choose from: " + codecStrs);
+      }
+    } else if (codec instanceof org.apache.hadoop.conf.Configurable) {
+      // Must check instanceof codec as BZip2Codec doesn't inherit Configurable
+      // Must set the configuration for Configurable objects that may or do use
+      // native libs
+      ((org.apache.hadoop.conf.Configurable) codec).setConf(conf);
+    }
+    return codec;
+  }
+
+
+  /**
+   * Pull events out of the channel and send them to HDFS. Take at most
+   * batchSize events per Transaction. Find the corresponding bucket for the
+   * event. Ensure the file is open. Serialize the data and write it to the
+   * file on HDFS. <br/>
+   * This method is not thread safe.
+   */
+  public Status process() throws EventDeliveryException {
+    Channel channel = getChannel();
+    Transaction transaction = channel.getTransaction();
+    List<BucketWriter> writers = Lists.newArrayList();
+    transaction.begin();
+    try {
+      int txnEventCount = 0;
+      for (txnEventCount = 0; txnEventCount < batchSize; txnEventCount++) {
+        Event event = channel.take();
+        if (event == null) {
+          break;
+        }
+
+        // reconstruct the path name by substituting place holders
+        String realPath = BucketPath.escapeString(filePath, event.getHeaders(),
+            timeZone, needRounding, roundUnit, roundValue, useLocalTime);
+        String realName = BucketPath.escapeString(fileName, event.getHeaders(),
+            timeZone, needRounding, roundUnit, roundValue, useLocalTime);
+
+        String lookupPath = realPath + DIRECTORY_DELIMITER + realName;
+        BucketWriter bucketWriter;
+        HDFSWriter hdfsWriter = null;
+        // Callback to remove the reference to the bucket writer from the
+        // sfWriters map so that all buffers used by the HDFS file
+        // handles are garbage collected.
+        WriterCallback closeCallback = new WriterCallback() {
+          @Override
+          public void run(String bucketPath) {
+            LOG.info("Writer callback called.");
+            synchronized (sfWritersLock) {
+              sfWriters.remove(bucketPath);
+            }
+          }
+        };
+        synchronized (sfWritersLock) {
+          bucketWriter = sfWriters.get(lookupPath);
+          // we haven't seen this file yet, so open it and cache the handle
+          if (bucketWriter == null) {
+            hdfsWriter = writerFactory.getWriter(fileType);
+            bucketWriter = initializeBucketWriter(realPath, realName,
+              lookupPath, hdfsWriter, closeCallback);
+            sfWriters.put(lookupPath, bucketWriter);
+          }
+        }
+
+        // track the buckets getting written in this transaction
+        if (!writers.contains(bucketWriter)) {
+          writers.add(bucketWriter);
+        }
+
+        // Write the data to HDFS
+        try {
+          bucketWriter.append(event);
+        } catch (BucketClosedException ex) {
+          LOG.info("Bucket was closed while trying to append, " +
+                   "reinitializing bucket and writing event.");
+          hdfsWriter = writerFactory.getWriter(fileType);
+          bucketWriter = initializeBucketWriter(realPath, realName,
+            lookupPath, hdfsWriter, closeCallback);
+          synchronized (sfWritersLock) {
+            sfWriters.put(lookupPath, bucketWriter);
+          }
+          bucketWriter.append(event);
+        }
+      }
+
+      if (txnEventCount == 0) {
+        sinkCounter.incrementBatchEmptyCount();
+      } else if (txnEventCount == batchSize) {
+        sinkCounter.incrementBatchCompleteCount();
+      } else {
+        sinkCounter.incrementBatchUnderflowCount();
+      }
+
+      // flush all pending buckets before committing the transaction
+      for (BucketWriter bucketWriter : writers) {
+        bucketWriter.flush();
+      }
+
+      transaction.commit();
+
+      if (txnEventCount < 1) {
+        return Status.BACKOFF;
+      } else {
+        sinkCounter.addToEventDrainSuccessCount(txnEventCount);
+        return Status.READY;
+      }
+    } catch (IOException eIO) {
+      transaction.rollback();
+      LOG.warn("HDFS IO error", eIO);
+      return Status.BACKOFF;
+    } catch (Throwable th) {
+      transaction.rollback();
+      LOG.error("process failed", th);
+      if (th instanceof Error) {
+        throw (Error) th;
+      } else {
+        throw new EventDeliveryException(th);
+      }
+    } finally {
+      transaction.close();
+    }
+  }
+
+  private BucketWriter initializeBucketWriter(String realPath,
+      String realName, String lookupPath, HDFSWriter hdfsWriter,
+      WriterCallback closeCallback) {
+    BucketWriter bucketWriter = new BucketWriter(rollInterval,
+        rollSize, rollCount,
+        batchSize, context, realPath, realName, inUsePrefix, inUseSuffix,
+        suffix, codeC, compType, hdfsWriter, timedRollerPool,
+        privExecutor, sinkCounter, idleTimeout, closeCallback,
+        lookupPath, callTimeout, callTimeoutPool, retryInterval,
+        tryCount);
+    if (mockFs != null) {
+      bucketWriter.setFileSystem(mockFs);
+      bucketWriter.setMockStream(mockWriter);
+    }
+    return bucketWriter;
+  }
+
+  @Override
+  public void stop() {
+    // do not constrain close() calls with a timeout
+    synchronized (sfWritersLock) {
+      for (Entry<String, BucketWriter> entry : sfWriters.entrySet()) {
+        LOG.info("Closing {}", entry.getKey());
+
+        try {
+          entry.getValue().close();
+        } catch (Exception ex) {
+          LOG.warn("Exception while closing " + entry.getKey() + ". " +
+                  "Exception follows.", ex);
+          if (ex instanceof InterruptedException) {
+            Thread.currentThread().interrupt();
+          }
+        }
+      }
+    }
+
+    // shut down all our thread pools
+    ExecutorService[] toShutdown = { callTimeoutPool, timedRollerPool };
+    for (ExecutorService execService : toShutdown) {
+      execService.shutdown();
+      try {
+        while (!execService.isTerminated()) {
+          execService.awaitTermination(
+                  Math.max(defaultCallTimeout, callTimeout), TimeUnit.MILLISECONDS);
+        }
+      } catch (InterruptedException ex) {
+        LOG.warn("shutdown interrupted on " + execService, ex);
+      }
+    }
+
+    callTimeoutPool = null;
+    timedRollerPool = null;
+
+    synchronized (sfWritersLock) {
+      sfWriters.clear();
+      sfWriters = null;
+    }
+    sinkCounter.stop();
+    super.stop();
+  }
+
+  @Override
+  public void start() {
+    String timeoutName = "hdfs-" + getName() + "-call-runner-%d";
+    callTimeoutPool = Executors.newFixedThreadPool(threadsPoolSize,
+            new ThreadFactoryBuilder().setNameFormat(timeoutName).build());
+
+    String rollerName = "hdfs-" + getName() + "-roll-timer-%d";
+    timedRollerPool = Executors.newScheduledThreadPool(rollTimerPoolSize,
+            new ThreadFactoryBuilder().setNameFormat(rollerName).build());
+
+    this.sfWriters = new WriterLinkedHashMap(maxOpenFiles);
+    sinkCounter.start();
+    super.start();
+  }
+
+  @Override
+  public String toString() {
+    return "{ Sink type:" + getClass().getSimpleName() + ", name:" + getName() +
+            " }";
+  }
+
+  @VisibleForTesting
+  void setBucketClock(Clock clock) {
+    BucketPath.setClock(clock);
+  }
+
+  @VisibleForTesting
+  void setMockFs(FileSystem mockFs) {
+    this.mockFs = mockFs;
+  }
+
+  @VisibleForTesting
+  void setMockWriter(HDFSWriter writer) {
+    this.mockWriter = writer;
+  }
+
+  @VisibleForTesting
+  int getTryCount() {
+    return tryCount;
+  }
+}
diff --git a/code/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSSequenceFile.java b/code/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSSequenceFile.java
new file mode 100644
index 0000000..c5430ba
--- /dev/null
+++ b/code/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSSequenceFile.java
@@ -0,0 +1,122 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flume.sink.hdfs;
+
+import java.io.IOException;
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FSDataOutputStream;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.LocalFileSystem;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.io.SequenceFile;
+import org.apache.hadoop.io.SequenceFile.CompressionType;
+import org.apache.hadoop.io.compress.CompressionCodec;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public class HDFSSequenceFile extends AbstractHDFSWriter {
+
+  private static final Logger logger =
+      LoggerFactory.getLogger(HDFSSequenceFile.class);
+  private SequenceFile.Writer writer;
+  private String writeFormat;
+  private Context serializerContext;
+  private SequenceFileSerializer serializer;
+  private boolean useRawLocalFileSystem;
+  private FSDataOutputStream outStream = null;
+
+  public HDFSSequenceFile() {
+    writer = null;
+  }
+
+  @Override
+  public void configure(Context context) {
+    super.configure(context);
+
+    // use the binary Writable serializer by default
+    writeFormat = context.getString("hdfs.writeFormat",
+      SequenceFileSerializerType.Writable.name());
+    useRawLocalFileSystem = context.getBoolean("hdfs.useRawLocalFileSystem",
+        false);
+    serializerContext = new Context(
+            context.getSubProperties(SequenceFileSerializerFactory.CTX_PREFIX));
+    serializer = SequenceFileSerializerFactory
+            .getSerializer(writeFormat, serializerContext);
+    logger.info("writeFormat = " + writeFormat + ", UseRawLocalFileSystem = "
+        + useRawLocalFileSystem);
+  }
+
+  @Override
+  public void open(String filePath) throws IOException {
+    open(filePath, null, CompressionType.NONE);
+  }
+
+  @Override
+  public void open(String filePath, CompressionCodec codeC,
+      CompressionType compType) throws IOException {
+    Configuration conf = new Configuration();
+    Path dstPath = new Path(filePath);
+    FileSystem hdfs = dstPath.getFileSystem(conf);
+    open(dstPath, codeC, compType, conf, hdfs);
+  }
+
+  protected void open(Path dstPath, CompressionCodec codeC,
+      CompressionType compType, Configuration conf, FileSystem hdfs)
+          throws IOException {
+    if (useRawLocalFileSystem) {
+      if (hdfs instanceof LocalFileSystem) {
+        hdfs = ((LocalFileSystem)hdfs).getRaw();
+      } else {
+        logger.warn("useRawLocalFileSystem is set to true but file system " +
+            "is not of type LocalFileSystem: " + hdfs.getClass().getName());
+      }
+    }
+    if (conf.getBoolean("hdfs.append.support", false) && hdfs.isFile(dstPath)) {
+      outStream = hdfs.append(dstPath);
+    } else {
+      outStream = hdfs.create(dstPath);
+    }
+    writer = SequenceFile.createWriter(conf, outStream,
+        serializer.getKeyClass(), serializer.getValueClass(), compType, codeC);
+
+    registerCurrentStream(outStream, hdfs, dstPath);
+  }
+
+  @Override
+  public void append(Event e) throws IOException {
+    for (SequenceFileSerializer.Record record : serializer.serialize(e)) {
+      writer.append(record.getKey(), record.getValue());
+    }
+  }
+
+  @Override
+  public void sync() throws IOException {
+    writer.sync();
+    hflushOrSync(outStream);
+  }
+
+  @Override
+  public void close() throws IOException {
+    writer.close();
+    outStream.close();
+    unregisterCurrentStream();
+  }
+}
diff --git a/code/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSTextSerializer.java b/code/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSTextSerializer.java
new file mode 100644
index 0000000..32fd206
--- /dev/null
+++ b/code/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSTextSerializer.java
@@ -0,0 +1,79 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.flume.sink.hdfs;
+
+import java.util.Collections;
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.hadoop.io.Text;
+import org.apache.hadoop.io.LongWritable;
+
+public class HDFSTextSerializer implements SequenceFileSerializer {
+
+  private Text makeText(Event e) {
+    Text textObject = new Text();
+    textObject.set(e.getBody(), 0, e.getBody().length);
+    return textObject;
+  }
+
+  @Override
+  public Class<LongWritable> getKeyClass() {
+    return LongWritable.class;
+  }
+
+  @Override
+  public Class<Text> getValueClass() {
+    return Text.class;
+  }
+
+  @Override
+  public Iterable<Record> serialize(Event e) {
+    Object key = getKey(e);
+    Object value = getValue(e);
+    return Collections.singletonList(new Record(key, value));
+  }
+
+  private Object getKey(Event e) {
+    // Key each record by the event's "timestamp" header, or by the current time if absent
+    String timestamp = e.getHeaders().get("timestamp");
+    long eventStamp;
+
+    if (timestamp == null) {
+      eventStamp = System.currentTimeMillis();
+    } else {
+      eventStamp = Long.parseLong(timestamp);
+    }
+    return new LongWritable(eventStamp);
+  }
+
+  private Object getValue(Event e) {
+    return makeText(e);
+  }
+
+  public static class Builder implements SequenceFileSerializer.Builder {
+
+    @Override
+    public SequenceFileSerializer build(Context context) {
+      return new HDFSTextSerializer();
+    }
+
+  }
+
+}
diff --git a/code/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSWritableSerializer.java b/code/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSWritableSerializer.java
new file mode 100644
index 0000000..b25a6ea
--- /dev/null
+++ b/code/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSWritableSerializer.java
@@ -0,0 +1,77 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flume.sink.hdfs;
+
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.hadoop.io.BytesWritable;
+import org.apache.hadoop.io.LongWritable;
+
+import java.util.Collections;
+
+public class HDFSWritableSerializer implements SequenceFileSerializer {
+
+  private BytesWritable makeByteWritable(Event e) {
+    BytesWritable bytesObject = new BytesWritable();
+    bytesObject.set(e.getBody(), 0, e.getBody().length);
+    return bytesObject;
+  }
+
+  @Override
+  public Class<LongWritable> getKeyClass() {
+    return LongWritable.class;
+  }
+
+  @Override
+  public Class<BytesWritable> getValueClass() {
+    return BytesWritable.class;
+  }
+
+  @Override
+  public Iterable<Record> serialize(Event e) {
+    Object key = getKey(e);
+    Object value = getValue(e);
+    return Collections.singletonList(new Record(key, value));
+  }
+
+  private Object getKey(Event e) {
+    String timestamp = e.getHeaders().get("timestamp");
+    long eventStamp;
+
+    if (timestamp == null) {
+      eventStamp = System.currentTimeMillis();
+    } else {
+      eventStamp = Long.parseLong(timestamp);
+    }
+    return new LongWritable(eventStamp);
+  }
+
+  private Object getValue(Event e) {
+    return makeByteWritable(e);
+  }
+
+  public static class Builder implements SequenceFileSerializer.Builder {
+
+    @Override
+    public SequenceFileSerializer build(Context context) {
+      return new HDFSWritableSerializer();
+    }
+
+  }
+
+}
diff --git a/code/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSWriter.java b/code/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSWriter.java
new file mode 100644
index 0000000..44a984a
--- /dev/null
+++ b/code/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSWriter.java
@@ -0,0 +1,47 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flume.sink.hdfs;
+
+import java.io.IOException;
+
+import org.apache.flume.Event;
+import org.apache.flume.annotations.InterfaceAudience;
+import org.apache.flume.annotations.InterfaceStability;
+import org.apache.flume.conf.Configurable;
+import org.apache.hadoop.io.SequenceFile.CompressionType;
+import org.apache.hadoop.io.compress.CompressionCodec;
+
+@InterfaceAudience.Private
+@InterfaceStability.Evolving
+public interface HDFSWriter extends Configurable {
+
+  public void open(String filePath) throws IOException;
+
+  public void open(String filePath, CompressionCodec codec,
+      CompressionType cType) throws IOException;
+
+  public void append(Event e) throws IOException;
+
+  public void sync() throws IOException;
+
+  public void close() throws IOException;
+
+  public boolean isUnderReplicated();
+
+}
diff --git a/code/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSWriterFactory.java b/code/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSWriterFactory.java
new file mode 100644
index 0000000..a90d536
--- /dev/null
+++ b/code/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSWriterFactory.java
@@ -0,0 +1,43 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flume.sink.hdfs;
+
+import java.io.IOException;
+
+public class HDFSWriterFactory {
+  static final String SequenceFileType = "SequenceFile";
+  static final String DataStreamType = "DataStream";
+  static final String CompStreamType = "CompressedStream";
+
+  public HDFSWriterFactory() {
+
+  }
+
+  public HDFSWriter getWriter(String fileType) throws IOException {
+    if (fileType.equalsIgnoreCase(SequenceFileType)) {
+      return new HDFSSequenceFile();
+    } else if (fileType.equalsIgnoreCase(DataStreamType)) {
+      return new HDFSDataStream();
+    } else if (fileType.equalsIgnoreCase(CompStreamType)) {
+      return new HDFSCompressedDataStream();
+    } else {
+      throw new IOException("File type " + fileType + " not supported");
+    }
+  }
+}
diff --git a/code/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/KerberosUser.java b/code/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/KerberosUser.java
new file mode 100644
index 0000000..43297e2
--- /dev/null
+++ b/code/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/KerberosUser.java
@@ -0,0 +1,72 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership. The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.flume.sink.hdfs;
+
+/**
+ * Simple Pair class used to define a unique (principal, keyTab) combination.
+ */
+public class KerberosUser {
+
+  private final String principal;
+  private final String keyTab;
+
+  public KerberosUser(String principal, String keyTab) {
+    this.principal = principal;
+    this.keyTab = keyTab;
+  }
+
+  public String getPrincipal() {
+    return principal;
+  }
+
+  public String getKeyTab() {
+    return keyTab;
+  }
+
+  @Override
+  public boolean equals(Object obj) {
+    if (obj == null) {
+      return false;
+    }
+    if (getClass() != obj.getClass()) {
+      return false;
+    }
+    final KerberosUser other = (KerberosUser) obj;
+    if ((this.principal == null) ?
+        (other.principal != null) :
+        !this.principal.equals(other.principal)) {
+      return false;
+    }
+    if ((this.keyTab == null) ? (other.keyTab != null) : !this.keyTab.equals(other.keyTab)) {
+      return false;
+    }
+    return true;
+  }
+
+  @Override
+  public int hashCode() {
+    int hash = 7;
+    hash = 41 * hash + (this.principal != null ? this.principal.hashCode() : 0);
+    hash = 41 * hash + (this.keyTab != null ? this.keyTab.hashCode() : 0);
+    return hash;
+  }
+
+  @Override
+  public String toString() {
+    return "{ principal: " + principal + ", keytab: " + keyTab + " }";
+  }
+}
diff --git a/code/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/SequenceFileSerializer.java b/code/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/SequenceFileSerializer.java
new file mode 100644
index 0000000..ec2b760
--- /dev/null
+++ b/code/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/SequenceFileSerializer.java
@@ -0,0 +1,68 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flume.sink.hdfs;
+
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+
+public interface SequenceFileSerializer {
+
+  Class<?> getKeyClass();
+
+  Class<?> getValueClass();
+
+  /**
+   * Format the given event into zero, one or more SequenceFile records
+   *
+   * @param e
+   *         event
+   * @return a list of records corresponding to the given event
+   */
+  Iterable<Record> serialize(Event e);
+
+  /**
+   * Knows how to construct this output formatter.<br/>
+   * <b>Note: Implementations MUST provide a public no-arg constructor.</b>
+   */
+  public interface Builder {
+    public SequenceFileSerializer build(Context context);
+  }
+
+  /**
+   * A key-value pair making up a record in an HDFS SequenceFile
+   */
+  public static class Record {
+    private final Object key;
+    private final Object value;
+
+    public Record(Object key, Object value) {
+      this.key = key;
+      this.value = value;
+    }
+
+    public Object getKey() {
+      return key;
+    }
+
+    public Object getValue() {
+      return value;
+    }
+  }
+
+}
diff --git a/code/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/SequenceFileSerializerFactory.java b/code/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/SequenceFileSerializerFactory.java
new file mode 100644
index 0000000..5678836
--- /dev/null
+++ b/code/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/SequenceFileSerializerFactory.java
@@ -0,0 +1,90 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.flume.sink.hdfs;
+
+import com.google.common.base.Preconditions;
+import org.apache.flume.Context;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public class SequenceFileSerializerFactory {
+
+  private static final Logger logger =
+      LoggerFactory.getLogger(SequenceFileSerializerFactory.class);
+
+  /**
+   * {@link Context} prefix
+   */
+  static final String CTX_PREFIX = "writeFormat.";
+
+  @SuppressWarnings("unchecked")
+  static SequenceFileSerializer getSerializer(String formatType,
+                                              Context context) {
+
+    Preconditions.checkNotNull(formatType,
+        "serializer type must not be null");
+
+    // try to find builder class in enum of known formatters
+    SequenceFileSerializerType type;
+    try {
+      type = SequenceFileSerializerType.valueOf(formatType);
+    } catch (IllegalArgumentException e) {
+      logger.debug("Not in enum, loading builder class: {}", formatType);
+      type = SequenceFileSerializerType.Other;
+    }
+    Class<? extends SequenceFileSerializer.Builder> builderClass =
+        type.getBuilderClass();
+
+    // handle the case where they have specified their own builder in the config
+    if (builderClass == null) {
+      try {
+        Class<?> c = Class.forName(formatType);
+        if (c != null && SequenceFileSerializer.Builder.class.isAssignableFrom(c)) {
+          builderClass = (Class<? extends SequenceFileSerializer.Builder>) c;
+        } else {
+          logger.error("Unable to instantiate Builder from {}", formatType);
+          return null;
+        }
+      } catch (ClassNotFoundException ex) {
+        logger.error("Class not found: " + formatType, ex);
+        return null;
+      } catch (ClassCastException ex) {
+        logger.error("Class does not implement " +
+            SequenceFileSerializer.Builder.class.getCanonicalName() + ": " +
+            formatType, ex);
+        return null;
+      }
+    }
+
+    // build the builder
+    SequenceFileSerializer.Builder builder;
+    try {
+      builder = builderClass.newInstance();
+    } catch (InstantiationException ex) {
+      logger.error("Cannot instantiate builder: " + formatType, ex);
+      return null;
+    } catch (IllegalAccessException ex) {
+      logger.error("Cannot instantiate builder: " + formatType, ex);
+      return null;
+    }
+
+    return builder.build(context);
+  }
+
+}
diff --git a/code/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/SequenceFileSerializerType.java b/code/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/SequenceFileSerializerType.java
new file mode 100644
index 0000000..2ad7689
--- /dev/null
+++ b/code/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/SequenceFileSerializerType.java
@@ -0,0 +1,37 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.flume.sink.hdfs;
+
+public enum SequenceFileSerializerType {
+  Writable(HDFSWritableSerializer.Builder.class),
+  Text(HDFSTextSerializer.Builder.class),
+  Other(null);
+
+  private final Class<? extends SequenceFileSerializer.Builder> builderClass;
+
+  SequenceFileSerializerType(Class<? extends SequenceFileSerializer.Builder> builderClass) {
+    this.builderClass = builderClass;
+  }
+
+  public Class<? extends SequenceFileSerializer.Builder> getBuilderClass() {
+    return builderClass;
+  }
+
+}
+
diff --git a/code/flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/HDFSBadDataStream.java b/code/flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/HDFSBadDataStream.java
new file mode 100644
index 0000000..d325233
--- /dev/null
+++ b/code/flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/HDFSBadDataStream.java
@@ -0,0 +1,45 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.flume.sink.hdfs;
+
+import java.io.IOException;
+import org.apache.flume.Event;
+
+public class HDFSBadDataStream extends HDFSDataStream {
+  public class HDFSBadSeqWriter extends HDFSSequenceFile {
+    @Override
+    public void append(Event e) throws IOException {
+
+      if (e.getHeaders().containsKey("fault")) {
+        throw new IOException("Injected fault");
+      } else if (e.getHeaders().containsKey("slow")) {
+        long waitTime = Long.parseLong(e.getHeaders().get("slow"));
+        try {
+          Thread.sleep(waitTime);
+        } catch (InterruptedException eT) {
+          throw new IOException("append interrupted", eT);
+        }
+      }
+      super.append(e);
+    }
+
+  }
+
+}
diff --git a/code/flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/HDFSTestSeqWriter.java b/code/flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/HDFSTestSeqWriter.java
new file mode 100644
index 0000000..f1dadf1
--- /dev/null
+++ b/code/flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/HDFSTestSeqWriter.java
@@ -0,0 +1,77 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.flume.sink.hdfs;
+
+import org.apache.flume.Event;
+import org.apache.hadoop.io.SequenceFile.CompressionType;
+import org.apache.hadoop.io.compress.CompressionCodec;
+
+import java.io.IOException;
+
+public class HDFSTestSeqWriter extends HDFSSequenceFile {
+  protected volatile boolean closed;
+  protected volatile boolean opened;
+
+  private int openCount = 0;
+
+  HDFSTestSeqWriter(int openCount) {
+    this.openCount = openCount;
+  }
+
+  @Override
+  public void open(String filePath, CompressionCodec codeC, CompressionType compType)
+      throws IOException {
+    super.open(filePath, codeC, compType);
+    if (closed) {
+      opened = true;
+    }
+  }
+
+  @Override
+  public void append(Event e) throws IOException {
+
+    if (e.getHeaders().containsKey("fault")) {
+      throw new IOException("Injected fault");
+    } else if (e.getHeaders().containsKey("fault-once")) {
+      e.getHeaders().remove("fault-once");
+      throw new IOException("Injected fault");
+    } else if (e.getHeaders().containsKey("fault-until-reopen")) {
+      // Fail only for the first writer opened; writers created after a reopen succeed.
+      if (openCount == 1) {
+        throw new IOException("Injected fault-until-reopen");
+      }
+    } else if (e.getHeaders().containsKey("slow")) {
+      long waitTime = Long.parseLong(e.getHeaders().get("slow"));
+      try {
+        Thread.sleep(waitTime);
+      } catch (InterruptedException eT) {
+        throw new IOException("append interrupted", eT);
+      }
+    }
+
+    super.append(e);
+  }
+
+  @Override
+  public void close() throws IOException {
+    closed = true;
+    super.close();
+  }
+}
diff --git a/code/flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/HDFSTestWriterFactory.java b/code/flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/HDFSTestWriterFactory.java
new file mode 100644
index 0000000..70bd9e6
--- /dev/null
+++ b/code/flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/HDFSTestWriterFactory.java
@@ -0,0 +1,42 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.flume.sink.hdfs;
+
+import java.io.IOException;
+import java.util.concurrent.atomic.AtomicInteger;
+
+public class HDFSTestWriterFactory extends HDFSWriterFactory {
+  static final String TestSequenceFileType = "SequenceFile";
+  static final String BadDataStreamType = "DataStream";
+
+  // Package-private so tests can read how many sequence-file writers were created.
+  AtomicInteger openCount = new AtomicInteger(0);
+
+  @Override
+  public HDFSWriter getWriter(String fileType) throws IOException {
+    if (TestSequenceFileType.equals(fileType)) {
+      return new HDFSTestSeqWriter(openCount.incrementAndGet());
+    } else if (BadDataStreamType.equals(fileType)) {
+      return new HDFSBadDataStream();
+    } else {
+      throw new IOException("File type " + fileType + " not supported");
+    }
+  }
+}
diff --git a/code/flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockDataStream.java b/code/flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockDataStream.java
new file mode 100644
index 0000000..a85a99f
--- /dev/null
+++ b/code/flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockDataStream.java
@@ -0,0 +1,39 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.flume.sink.hdfs;
+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+
+import java.io.IOException;
+
+class MockDataStream extends HDFSDataStream {
+  private final FileSystem fs;
+
+  MockDataStream(FileSystem fs) {
+    this.fs = fs;
+  }
+
+  @Override
+  protected FileSystem getDfs(Configuration conf, Path dstPath) throws IOException {
+    return fs;
+  }
+
+}
diff --git a/code/flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockFileSystem.java b/code/flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockFileSystem.java
new file mode 100644
index 0000000..a079b83
--- /dev/null
+++ b/code/flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockFileSystem.java
@@ -0,0 +1,139 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.flume.sink.hdfs;
+
+import java.io.IOException;
+import java.net.URI;
+
+import org.apache.hadoop.fs.FSDataInputStream;
+import org.apache.hadoop.fs.FSDataOutputStream;
+import org.apache.hadoop.fs.FileStatus;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.permission.FsPermission;
+import org.apache.hadoop.util.Progressable;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public class MockFileSystem extends FileSystem {
+
+  private static final Logger logger =
+      LoggerFactory.getLogger(MockFileSystem.class);
+
+  FileSystem fs;
+  int numberOfRetriesRequired;
+  MockFsDataOutputStream latestOutputStream;
+  int currentRenameAttempts;
+  boolean closeSucceed = true;
+
+  public MockFileSystem(FileSystem fs, int numberOfRetriesRequired) {
+    this.fs = fs;
+    this.numberOfRetriesRequired = numberOfRetriesRequired;
+  }
+
+  public MockFileSystem(FileSystem fs,
+                        int numberOfRetriesRequired, boolean closeSucceed) {
+    this.fs = fs;
+    this.numberOfRetriesRequired = numberOfRetriesRequired;
+    this.closeSucceed = closeSucceed;
+  }
+
+  @Override
+  public FSDataOutputStream append(Path arg0, int arg1, Progressable arg2)
+      throws IOException {
+
+    latestOutputStream = new MockFsDataOutputStream(
+      fs.append(arg0, arg1, arg2), closeSucceed);
+
+    return latestOutputStream;
+  }
+
+  @Override
+  public FSDataOutputStream create(Path arg0) throws IOException {
+    latestOutputStream = new MockFsDataOutputStream(fs.create(arg0), closeSucceed);
+    return latestOutputStream;
+  }
+
+  @Override
+  public FSDataOutputStream create(Path arg0, FsPermission arg1, boolean arg2, int arg3,
+                                   short arg4, long arg5, Progressable arg6)
+      throws IOException {
+    throw new IOException("Not a real file system");
+  }
+
+  @Override
+  @Deprecated
+  public boolean delete(Path arg0) throws IOException {
+    return fs.delete(arg0);
+  }
+
+  @Override
+  public boolean delete(Path arg0, boolean arg1) throws IOException {
+    return fs.delete(arg0, arg1);
+  }
+
+  @Override
+  public FileStatus getFileStatus(Path arg0) throws IOException {
+    return fs.getFileStatus(arg0);
+  }
+
+  @Override
+  public URI getUri() {
+    return fs.getUri();
+  }
+
+  @Override
+  public Path getWorkingDirectory() {
+    return fs.getWorkingDirectory();
+  }
+
+  @Override
+  public FileStatus[] listStatus(Path arg0) throws IOException {
+    return fs.listStatus(arg0);
+  }
+
+  @Override
+  public boolean mkdirs(Path arg0, FsPermission arg1) throws IOException {
+    // Delegate to the wrapped file system.
+    return fs.mkdirs(arg0, arg1);
+  }
+
+  @Override
+  public FSDataInputStream open(Path arg0, int arg1) throws IOException {
+    return fs.open(arg0, arg1);
+  }
+
+  @Override
+  public boolean rename(Path arg0, Path arg1) throws IOException {
+    currentRenameAttempts++;
+    logger.info("Attempting to Rename: '{}' of '{}'",
+                currentRenameAttempts, numberOfRetriesRequired);
+    if (currentRenameAttempts >= numberOfRetriesRequired || numberOfRetriesRequired == 0) {
+      logger.info("Renaming file");
+      return fs.rename(arg0, arg1);
+    } else {
+      throw new IOException("MockIOException");
+    }
+  }
+
+  @Override
+  public void setWorkingDirectory(Path arg0) {
+    fs.setWorkingDirectory(arg0);
+  }
+}
diff --git a/code/flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockFsDataOutputStream.java b/code/flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockFsDataOutputStream.java
new file mode 100644
index 0000000..f5d579c
--- /dev/null
+++ b/code/flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockFsDataOutputStream.java
@@ -0,0 +1,49 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flume.sink.hdfs;
+
+import org.apache.hadoop.fs.FSDataOutputStream;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+
+public class MockFsDataOutputStream extends FSDataOutputStream {
+
+  private static final Logger logger =
+      LoggerFactory.getLogger(MockFsDataOutputStream.class);
+
+  boolean closeSucceed;
+
+  public MockFsDataOutputStream(FSDataOutputStream wrapMe, boolean closeSucceed)
+      throws IOException {
+    super(wrapMe.getWrappedStream(), null);
+    this.closeSucceed = closeSucceed;
+  }
+
+  @Override
+  public void close() throws IOException {
+    logger.info("Close Succeeded - {}", closeSucceed);
+    if (closeSucceed) {
+      logger.info("closing file");
+      super.close();
+    } else {
+      throw new IOException("MockIOException");
+    }
+  }
+}
diff --git a/code/flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockHDFSWriter.java b/code/flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockHDFSWriter.java
new file mode 100644
index 0000000..05c4316
--- /dev/null
+++ b/code/flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockHDFSWriter.java
@@ -0,0 +1,95 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.flume.sink.hdfs;
+
+import java.io.IOException;
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.hadoop.io.SequenceFile.CompressionType;
+import org.apache.hadoop.io.compress.CompressionCodec;
+
+public class MockHDFSWriter implements HDFSWriter {
+
+  private int filesOpened = 0;
+  private int filesClosed = 0;
+  private int bytesWritten = 0;
+  private int eventsWritten = 0;
+  private String filePath = null;
+
+  public int getFilesOpened() {
+    return filesOpened;
+  }
+
+  public int getFilesClosed() {
+    return filesClosed;
+  }
+
+  public int getBytesWritten() {
+    return bytesWritten;
+  }
+
+  public int getEventsWritten() {
+    return eventsWritten;
+  }
+
+  public String getOpenedFilePath() {
+    return filePath;
+  }
+
+  public void clear() {
+    filesOpened = 0;
+    filesClosed = 0;
+    bytesWritten = 0;
+    eventsWritten = 0;
+  }
+
+  public void configure(Context context) {
+    // no-op
+  }
+
+  public void open(String filePath) throws IOException {
+    this.filePath = filePath;
+    filesOpened++;
+  }
+
+  public void open(String filePath, CompressionCodec codec, CompressionType cType)
+      throws IOException {
+    this.filePath = filePath;
+    filesOpened++;
+  }
+
+  public void append(Event e) throws IOException {
+    eventsWritten++;
+    bytesWritten += e.getBody().length;
+  }
+
+  public void sync() throws IOException {
+    // does nothing
+  }
+
+  public void close() throws IOException {
+    filesClosed++;
+  }
+
+  @Override
+  public boolean isUnderReplicated() {
+    return false;
+  }
+
+}
diff --git a/code/flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MyCustomSerializer.java b/code/flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MyCustomSerializer.java
new file mode 100644
index 0000000..72164fd
--- /dev/null
+++ b/code/flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MyCustomSerializer.java
@@ -0,0 +1,58 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.flume.sink.hdfs;
+
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.hadoop.io.BytesWritable;
+import org.apache.hadoop.io.LongWritable;
+
+import java.util.Arrays;
+
+public class MyCustomSerializer implements SequenceFileSerializer {
+
+  @Override
+  public Class<LongWritable> getKeyClass() {
+    return LongWritable.class;
+  }
+
+  @Override
+  public Class<BytesWritable> getValueClass() {
+    return BytesWritable.class;
+  }
+
+  @Override
+  public Iterable<Record> serialize(Event e) {
+    return Arrays.asList(
+        new Record(new LongWritable(1234L), new BytesWritable(new byte[10])),
+        new Record(new LongWritable(4567L), new BytesWritable(new byte[20]))
+    );
+  }
+
+  public static class Builder implements SequenceFileSerializer.Builder {
+
+    @Override
+    public SequenceFileSerializer build(Context context) {
+      return new MyCustomSerializer();
+    }
+
+  }
+
+}
diff --git a/code/flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestAvroEventSerializer.java b/code/flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestAvroEventSerializer.java
new file mode 100644
index 0000000..6b38da2
--- /dev/null
+++ b/code/flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestAvroEventSerializer.java
@@ -0,0 +1,183 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.flume.sink.hdfs;
+
+import com.google.common.base.Charsets;
+import com.google.common.io.Files;
+import java.io.ByteArrayOutputStream;
+import java.io.File;
+import java.io.FileOutputStream;
+import java.io.IOException;
+import java.io.OutputStream;
+import java.util.Arrays;
+import org.apache.avro.Schema;
+import org.apache.avro.file.DataFileReader;
+import org.apache.avro.generic.GenericData;
+import org.apache.avro.generic.GenericDatumReader;
+import org.apache.avro.generic.GenericRecord;
+import org.apache.avro.generic.GenericRecordBuilder;
+import org.apache.avro.io.BinaryEncoder;
+import org.apache.avro.io.DatumReader;
+import org.apache.avro.io.EncoderFactory;
+import org.apache.avro.reflect.ReflectDatumWriter;
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.flume.event.EventBuilder;
+import org.apache.flume.serialization.AvroEventSerializerConfigurationConstants;
+import org.apache.flume.serialization.EventSerializer;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;
+import org.junit.After;
+
+public class TestAvroEventSerializer {
+
+  private File file;
+
+  @Before
+  public void setUp() throws Exception {
+    file = File.createTempFile(getClass().getSimpleName(), "");
+  }
+
+  @After
+  public void tearDown() throws Exception {
+    file.delete();
+  }
+
+  @Test
+  public void testNoCompression() throws IOException {
+    createAvroFile(file, null, false, false);
+    validateAvroFile(file);
+  }
+
+  @Test
+  public void testNullCompression() throws IOException {
+    createAvroFile(file, "null", false, false);
+    validateAvroFile(file);
+  }
+
+  @Test
+  public void testDeflateCompression() throws IOException {
+    createAvroFile(file, "deflate", false, false);
+    validateAvroFile(file);
+  }
+
+  @Test
+  public void testSnappyCompression() throws IOException {
+    createAvroFile(file, "snappy", false, false);
+    validateAvroFile(file);
+  }
+
+  @Test
+  public void testSchemaUrl() throws IOException {
+    createAvroFile(file, null, true, false);
+    validateAvroFile(file);
+  }
+
+  @Test
+  public void testStaticSchemaUrl() throws IOException {
+    createAvroFile(file, null, false, true);
+    validateAvroFile(file);
+  }
+
+  @Test
+  public void testBothUrls() throws IOException {
+    createAvroFile(file, null, true, true);
+    validateAvroFile(file);
+  }
+
+  public void createAvroFile(File file, String codec, boolean useSchemaUrl,
+                             boolean useStaticSchemaUrl) throws IOException {
+    // serialize a few events using the reflection-based avro serializer
+    OutputStream out = new FileOutputStream(file);
+
+    Context ctx = new Context();
+    if (codec != null) {
+      ctx.put("compressionCodec", codec);
+    }
+
+    Schema schema = Schema.createRecord("myrecord", null, null, false);
+    schema.setFields(Arrays.asList(new Schema.Field[]{
+        new Schema.Field("message", Schema.create(Schema.Type.STRING), null, null)
+    }));
+    GenericRecordBuilder recordBuilder = new GenericRecordBuilder(schema);
+    File schemaFile = null;
+    if (useSchemaUrl || useStaticSchemaUrl) {
+      schemaFile = File.createTempFile(getClass().getSimpleName(), ".avsc");
+      Files.write(schema.toString(), schemaFile, Charsets.UTF_8);
+    }
+
+    if (useStaticSchemaUrl) {
+      ctx.put(AvroEventSerializerConfigurationConstants.STATIC_SCHEMA_URL,
+              schemaFile.toURI().toURL().toExternalForm());
+    }
+
+    EventSerializer.Builder builder = new AvroEventSerializer.Builder();
+    EventSerializer serializer = builder.build(ctx, out);
+
+    serializer.afterCreate();
+    for (int i = 0; i < 3; i++) {
+      GenericRecord record = recordBuilder.set("message", "Hello " + i).build();
+      Event event = EventBuilder.withBody(serializeAvro(record, schema));
+      if (schemaFile == null && !useSchemaUrl) {
+        event.getHeaders().put(AvroEventSerializer.AVRO_SCHEMA_LITERAL_HEADER,
+            schema.toString());
+      } else if (useSchemaUrl) {
+        event.getHeaders().put(AvroEventSerializer.AVRO_SCHEMA_URL_HEADER,
+            schemaFile.toURI().toURL().toExternalForm());
+      }
+      serializer.write(event);
+    }
+    serializer.flush();
+    serializer.beforeClose();
+    out.flush();
+    out.close();
+    if (schemaFile != null) {
+      schemaFile.delete();
+    }
+
+  }
+
+  private byte[] serializeAvro(Object datum, Schema schema) throws IOException {
+    ByteArrayOutputStream out = new ByteArrayOutputStream();
+    ReflectDatumWriter<Object> writer = new ReflectDatumWriter<Object>(schema);
+    BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
+    out.reset();
+    writer.write(datum, encoder);
+    encoder.flush();
+    return out.toByteArray();
+  }
+
+  public void validateAvroFile(File file) throws IOException {
+    // read the events back using GenericRecord
+    DatumReader<GenericRecord> reader = new GenericDatumReader<GenericRecord>();
+    DataFileReader<GenericRecord> fileReader =
+        new DataFileReader<GenericRecord>(file, reader);
+    GenericRecord record = new GenericData.Record(fileReader.getSchema());
+    int numEvents = 0;
+    while (fileReader.hasNext()) {
+      fileReader.next(record);
+      String bodyStr = record.get("message").toString();
+      System.out.println(bodyStr);
+      numEvents++;
+    }
+    fileReader.close();
+    Assert.assertEquals("Should have found a total of 3 events", 3, numEvents);
+  }
+}
diff --git a/code/flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestBucketWriter.java b/code/flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestBucketWriter.java
new file mode 100644
index 0000000..742deb0
--- /dev/null
+++ b/code/flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestBucketWriter.java
@@ -0,0 +1,450 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.flume.sink.hdfs;
+
+import com.google.common.base.Charsets;
+import org.apache.flume.Clock;
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.flume.auth.FlumeAuthenticationUtil;
+import org.apache.flume.auth.PrivilegedExecutor;
+import org.apache.flume.event.EventBuilder;
+import org.apache.flume.instrumentation.SinkCounter;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.io.SequenceFile;
+import org.apache.hadoop.io.SequenceFile.CompressionType;
+import org.apache.hadoop.io.compress.CompressionCodec;
+import org.junit.AfterClass;
+import org.junit.Assert;
+import org.junit.BeforeClass;
+import org.junit.Test;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.File;
+import java.io.IOException;
+import java.util.Calendar;
+import java.util.concurrent.Executors;
+import java.util.concurrent.ScheduledExecutorService;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.atomic.AtomicBoolean;
+
+public class TestBucketWriter {
+
+  private static final Logger logger = LoggerFactory.getLogger(TestBucketWriter.class);
+  private Context ctx = new Context();
+
+  private static ScheduledExecutorService timedRollerPool;
+  private static PrivilegedExecutor proxy;
+
+  @BeforeClass
+  public static void setup() {
+    timedRollerPool = Executors.newSingleThreadScheduledExecutor();
+    proxy = FlumeAuthenticationUtil.getAuthenticator(null, null).proxyAs(null);
+  }
+
+  @AfterClass
+  public static void teardown() throws InterruptedException {
+    timedRollerPool.shutdown();
+    timedRollerPool.awaitTermination(2, TimeUnit.SECONDS);
+    timedRollerPool.shutdownNow();
+  }
+
+  @Test
+  public void testEventCountingRoller() throws IOException, InterruptedException {
+    int maxEvents = 100;
+    MockHDFSWriter hdfsWriter = new MockHDFSWriter();
+    BucketWriter bucketWriter = new BucketWriter(
+        0, 0, maxEvents, 0, ctx, "/tmp", "file", "", ".tmp", null, null,
+        SequenceFile.CompressionType.NONE, hdfsWriter, timedRollerPool, proxy,
+        new SinkCounter("test-bucket-writer-" + System.currentTimeMillis()), 0, null, null, 30000,
+        Executors.newSingleThreadExecutor(), 0, 0);
+
+    Event e = EventBuilder.withBody("foo", Charsets.UTF_8);
+    for (int i = 0; i < 1000; i++) {
+      bucketWriter.append(e);
+    }
+
+    logger.info("Number of events written: {}", hdfsWriter.getEventsWritten());
+    logger.info("Number of bytes written: {}", hdfsWriter.getBytesWritten());
+    logger.info("Number of files opened: {}", hdfsWriter.getFilesOpened());
+
+    Assert.assertEquals("events written", 1000, hdfsWriter.getEventsWritten());
+    Assert.assertEquals("bytes written", 3000, hdfsWriter.getBytesWritten());
+    Assert.assertEquals("files opened", 10, hdfsWriter.getFilesOpened());
+  }
+
+  @Test
+  public void testSizeRoller() throws IOException, InterruptedException {
+    int maxBytes = 300;
+    MockHDFSWriter hdfsWriter = new MockHDFSWriter();
+    BucketWriter bucketWriter = new BucketWriter(
+        0, maxBytes, 0, 0, ctx, "/tmp", "file", "", ".tmp", null, null,
+        SequenceFile.CompressionType.NONE, hdfsWriter, timedRollerPool, proxy,
+        new SinkCounter("test-bucket-writer-" + System.currentTimeMillis()), 0, null, null, 30000,
+        Executors.newSingleThreadExecutor(), 0, 0);
+
+    Event e = EventBuilder.withBody("foo", Charsets.UTF_8);
+    for (int i = 0; i < 1000; i++) {
+      bucketWriter.append(e);
+    }
+
+    logger.info("Number of events written: {}", hdfsWriter.getEventsWritten());
+    logger.info("Number of bytes written: {}", hdfsWriter.getBytesWritten());
+    logger.info("Number of files opened: {}", hdfsWriter.getFilesOpened());
+
+    Assert.assertEquals("events written", 1000, hdfsWriter.getEventsWritten());
+    Assert.assertEquals("bytes written", 3000, hdfsWriter.getBytesWritten());
+    Assert.assertEquals("files opened", 10, hdfsWriter.getFilesOpened());
+  }
+
+  @Test
+  public void testIntervalRoller() throws IOException, InterruptedException {
+    final int ROLL_INTERVAL = 1; // seconds
+    final int NUM_EVENTS = 10;
+    final AtomicBoolean calledBack = new AtomicBoolean(false);
+
+    MockHDFSWriter hdfsWriter = new MockHDFSWriter();
+    BucketWriter bucketWriter = new BucketWriter(
+        ROLL_INTERVAL, 0, 0, 0, ctx, "/tmp", "file", "", ".tmp", null, null,
+        SequenceFile.CompressionType.NONE, hdfsWriter, timedRollerPool, proxy,
+        new SinkCounter("test-bucket-writer-" + System.currentTimeMillis()), 0,
+        new HDFSEventSink.WriterCallback() {
+          @Override
+          public void run(String filePath) {
+            calledBack.set(true);
+          }
+        }, null, 30000, Executors.newSingleThreadExecutor(), 0, 0);
+
+    Event e = EventBuilder.withBody("foo", Charsets.UTF_8);
+    long startNanos = System.nanoTime();
+    for (int i = 0; i < NUM_EVENTS - 1; i++) {
+      bucketWriter.append(e);
+    }
+
+    // sleep to force a roll... wait 2x interval just to be sure
+    Thread.sleep(2 * ROLL_INTERVAL * 1000L);
+
+    Assert.assertTrue(bucketWriter.closed);
+    Assert.assertTrue(calledBack.get());
+
+    bucketWriter = new BucketWriter(
+        ROLL_INTERVAL, 0, 0, 0, ctx, "/tmp", "file", "", ".tmp", null, null,
+        SequenceFile.CompressionType.NONE, hdfsWriter, timedRollerPool, proxy,
+        new SinkCounter("test-bucket-writer-" + System.currentTimeMillis()), 0, null, null, 30000,
+        Executors.newSingleThreadExecutor(), 0, 0);
+    // write one more event (to reopen a new file so we will roll again later)
+    bucketWriter.append(e);
+
+    long elapsedMillis = TimeUnit.MILLISECONDS.convert(
+        System.nanoTime() - startNanos, TimeUnit.NANOSECONDS);
+    long elapsedSeconds = elapsedMillis / 1000L;
+
+    logger.info("Time elapsed: {} milliseconds", elapsedMillis);
+    logger.info("Number of events written: {}", hdfsWriter.getEventsWritten());
+    logger.info("Number of bytes written: {}", hdfsWriter.getBytesWritten());
+    logger.info("Number of files opened: {}", hdfsWriter.getFilesOpened());
+    logger.info("Number of files closed: {}", hdfsWriter.getFilesClosed());
+
+    Assert.assertEquals("events written", NUM_EVENTS,
+        hdfsWriter.getEventsWritten());
+    Assert.assertEquals("bytes written", e.getBody().length * NUM_EVENTS,
+        hdfsWriter.getBytesWritten());
+    Assert.assertEquals("files opened", 2, hdfsWriter.getFilesOpened());
+
+    // before auto-roll
+    Assert.assertEquals("files closed", 1, hdfsWriter.getFilesClosed());
+
+    logger.info("Waiting for roll...");
+    Thread.sleep(2 * ROLL_INTERVAL * 1000L);
+
+    logger.info("Number of files closed: {}", hdfsWriter.getFilesClosed());
+    Assert.assertEquals("files closed", 2, hdfsWriter.getFilesClosed());
+  }
+
+  @Test
+  public void testIntervalRollerBug() throws IOException, InterruptedException {
+    final int ROLL_INTERVAL = 1; // seconds
+    final int NUM_EVENTS = 10;
+
+    HDFSWriter hdfsWriter = new HDFSWriter() {
+      private volatile boolean open = false;
+
+      public void configure(Context context) {
+      }
+
+      public void sync() throws IOException {
+        if (!open) {
+          throw new IOException("closed");
+        }
+      }
+
+      public void open(String filePath, CompressionCodec codec, CompressionType cType)
+          throws IOException {
+        open = true;
+      }
+
+      public void open(String filePath) throws IOException {
+        open = true;
+      }
+
+      public void close() throws IOException {
+        open = false;
+      }
+
+      @Override
+      public boolean isUnderReplicated() {
+        return false;
+      }
+
+      public void append(Event e) throws IOException {
+        // we just re-open in append if closed
+        open = true;
+      }
+    };
+
+    HDFSTextSerializer serializer = new HDFSTextSerializer();
+    File tmpFile = File.createTempFile("flume", "test");
+    tmpFile.deleteOnExit();
+    String path = tmpFile.getParent();
+    String name = tmpFile.getName();
+
+    BucketWriter bucketWriter = new BucketWriter(
+        ROLL_INTERVAL, 0, 0, 0, ctx, path, name, "", ".tmp", null, null,
+        SequenceFile.CompressionType.NONE, hdfsWriter, timedRollerPool, proxy,
+        new SinkCounter("test-bucket-writer-" + System.currentTimeMillis()), 0, null, null, 30000,
+        Executors.newSingleThreadExecutor(), 0, 0);
+
+    Event e = EventBuilder.withBody("foo", Charsets.UTF_8);
+    for (int i = 0; i < NUM_EVENTS - 1; i++) {
+      bucketWriter.append(e);
+    }
+
+    // sleep to force a roll... wait 2x interval just to be sure
+    Thread.sleep(2 * ROLL_INTERVAL * 1000L);
+
+    bucketWriter.flush(); // throws closed exception
+  }
+
+  @Test
+  public void testFileSuffixNotGiven() throws IOException, InterruptedException {
+    final int ROLL_INTERVAL = 1000; // seconds; long enough that no roll occurs during the test
+    final String suffix = null;
+
+    MockHDFSWriter hdfsWriter = new MockHDFSWriter();
+    BucketWriter bucketWriter = new BucketWriter(
+        ROLL_INTERVAL, 0, 0, 0, ctx, "/tmp", "file", "", ".tmp", suffix, null,
+        SequenceFile.CompressionType.NONE, hdfsWriter, timedRollerPool, proxy,
+        new SinkCounter("test-bucket-writer-" + System.currentTimeMillis()), 0, null, null, 30000,
+        Executors.newSingleThreadExecutor(), 0, 0);
+
+    // Override the clock so the test knows which timestamp to expect in the file name
+    final long testTime = System.currentTimeMillis();
+    Clock testClock = new Clock() {
+      public long currentTimeMillis() {
+        return testTime;
+      }
+    };
+    bucketWriter.setClock(testClock);
+
+    Event e = EventBuilder.withBody("foo", Charsets.UTF_8);
+    bucketWriter.append(e);
+
+    Assert.assertTrue("Incorrect suffix", hdfsWriter.getOpenedFilePath().endsWith(
+        Long.toString(testTime + 1) + ".tmp"));
+  }
+
+  @Test
+  public void testFileSuffixGiven() throws IOException, InterruptedException {
+    final int ROLL_INTERVAL = 1000; // seconds; long enough that no roll occurs during the test
+    final String suffix = ".avro";
+
+    MockHDFSWriter hdfsWriter = new MockHDFSWriter();
+    BucketWriter bucketWriter = new BucketWriter(
+        ROLL_INTERVAL, 0, 0, 0, ctx, "/tmp", "file", "", ".tmp", suffix, null,
+        SequenceFile.CompressionType.NONE, hdfsWriter, timedRollerPool, proxy,
+        new SinkCounter("test-bucket-writer-" + System.currentTimeMillis()), 0, null, null, 30000,
+        Executors.newSingleThreadExecutor(), 0, 0);
+
+    // Override the clock so the test knows which timestamp to expect in the file name
+
+    final long testTime = System.currentTimeMillis();
+
+    Clock testClock = new Clock() {
+      public long currentTimeMillis() {
+        return testTime;
+      }
+    };
+    bucketWriter.setClock(testClock);
+
+    Event e = EventBuilder.withBody("foo", Charsets.UTF_8);
+    bucketWriter.append(e);
+
+    Assert.assertTrue("Incorrect suffix",hdfsWriter.getOpenedFilePath().endsWith(
+        Long.toString(testTime + 1) + suffix + ".tmp"));
+  }
+
+  @Test
+  public void testFileSuffixCompressed()
+      throws IOException, InterruptedException {
+    final int ROLL_INTERVAL = 1000; // seconds; long enough that no roll occurs during the test
+    final String suffix = ".foo";
+
+    MockHDFSWriter hdfsWriter = new MockHDFSWriter();
+    BucketWriter bucketWriter = new BucketWriter(
+        ROLL_INTERVAL, 0, 0, 0, ctx, "/tmp", "file", "", ".tmp", suffix,
+        HDFSEventSink.getCodec("gzip"), SequenceFile.CompressionType.BLOCK, hdfsWriter,
+        timedRollerPool, proxy, new SinkCounter("test-bucket-writer-" + System.currentTimeMillis()),
+        0, null, null, 30000, Executors.newSingleThreadExecutor(), 0, 0
+    );
+
+    // Override the clock so the test knows which timestamp to expect in the file name
+    final long testTime = System.currentTimeMillis();
+
+    Clock testClock = new Clock() {
+      public long currentTimeMillis() {
+        return testTime;
+      }
+    };
+    bucketWriter.setClock(testClock);
+
+    Event e = EventBuilder.withBody("foo", Charsets.UTF_8);
+    bucketWriter.append(e);
+
+    Assert.assertTrue("Incorrect suffix", hdfsWriter.getOpenedFilePath().endsWith(
+        Long.toString(testTime + 1) + suffix + ".tmp"));
+  }
+
+  @Test
+  public void testInUsePrefix() throws IOException, InterruptedException {
+    final int ROLL_INTERVAL = 1000; // seconds; long enough that no roll occurs during the test
+    final String PREFIX = "BRNO_IS_CITY_IN_CZECH_REPUBLIC";
+
+    MockHDFSWriter hdfsWriter = new MockHDFSWriter();
+    HDFSTextSerializer formatter = new HDFSTextSerializer();
+    BucketWriter bucketWriter = new BucketWriter(
+        ROLL_INTERVAL, 0, 0, 0, ctx, "/tmp", "file", PREFIX, ".tmp", null, null,
+        SequenceFile.CompressionType.NONE, hdfsWriter, timedRollerPool, proxy,
+        new SinkCounter("test-bucket-writer-" + System.currentTimeMillis()), 0, null, null, 30000,
+        Executors.newSingleThreadExecutor(), 0, 0);
+
+    Event e = EventBuilder.withBody("foo", Charsets.UTF_8);
+    bucketWriter.append(e);
+
+    Assert.assertTrue("Incorrect in use prefix", hdfsWriter.getOpenedFilePath().contains(PREFIX));
+  }
+
+  @Test
+  public void testInUseSuffix() throws IOException, InterruptedException {
+    final int ROLL_INTERVAL = 1000; // seconds; long enough that no roll occurs during the test
+    final String SUFFIX = "WELCOME_TO_THE_HELLMOUNTH";
+
+    MockHDFSWriter hdfsWriter = new MockHDFSWriter();
+    HDFSTextSerializer serializer = new HDFSTextSerializer();
+    BucketWriter bucketWriter = new BucketWriter(
+        ROLL_INTERVAL, 0, 0, 0, ctx, "/tmp", "file", "", SUFFIX, null, null,
+        SequenceFile.CompressionType.NONE, hdfsWriter, timedRollerPool, proxy,
+        new SinkCounter("test-bucket-writer-" + System.currentTimeMillis()), 0, null, null, 30000,
+        Executors.newSingleThreadExecutor(), 0, 0);
+
+    Event e = EventBuilder.withBody("foo", Charsets.UTF_8);
+    bucketWriter.append(e);
+
+    Assert.assertTrue("Incorrect in use suffix", hdfsWriter.getOpenedFilePath().contains(SUFFIX));
+  }
+
+  @Test
+  public void testCallbackOnClose() throws IOException, InterruptedException {
+    final int ROLL_INTERVAL = 1000; // seconds; long enough that no roll occurs during the test
+    final String SUFFIX = "WELCOME_TO_THE_EREBOR";
+    final AtomicBoolean callbackCalled = new AtomicBoolean(false);
+
+    MockHDFSWriter hdfsWriter = new MockHDFSWriter();
+    BucketWriter bucketWriter = new BucketWriter(
+        ROLL_INTERVAL, 0, 0, 0, ctx, "/tmp", "file", "", SUFFIX, null, null,
+        SequenceFile.CompressionType.NONE, hdfsWriter, timedRollerPool, proxy,
+        new SinkCounter("test-bucket-writer-" + System.currentTimeMillis()), 0,
+        new HDFSEventSink.WriterCallback() {
+          @Override
+          public void run(String filePath) {
+            callbackCalled.set(true);
+          }
+        }, "blah", 30000, Executors.newSingleThreadExecutor(), 0, 0);
+
+    Event e = EventBuilder.withBody("foo", Charsets.UTF_8);
+    bucketWriter.append(e);
+    bucketWriter.close(true);
+
+    Assert.assertTrue(callbackCalled.get());
+  }
+
+
+
+  @Test
+  public void testSequenceFileRenameRetries() throws Exception {
+    SequenceFileRenameRetryCoreTest(1, true);
+    SequenceFileRenameRetryCoreTest(5, true);
+    SequenceFileRenameRetryCoreTest(2, true);
+
+    SequenceFileRenameRetryCoreTest(1, false);
+    SequenceFileRenameRetryCoreTest(5, false);
+    SequenceFileRenameRetryCoreTest(2, false);
+  }
+
+  public void SequenceFileRenameRetryCoreTest(int numberOfRetriesRequired, boolean closeSucceed)
+      throws Exception {
+    String hdfsPath = "file:///tmp/flume-test." +
+                      Calendar.getInstance().getTimeInMillis() +
+                      "." + Thread.currentThread().getId();
+
+    Context context = new Context();
+    Configuration conf = new Configuration();
+    FileSystem fs = FileSystem.get(conf);
+    Path dirPath = new Path(hdfsPath);
+    fs.delete(dirPath, true);
+    fs.mkdirs(dirPath);
+    context.put("hdfs.path", hdfsPath);
+    context.put("hdfs.closeTries", String.valueOf(numberOfRetriesRequired));
+    context.put("hdfs.rollCount", "1");
+    context.put("hdfs.retryInterval", "1");
+    context.put("hdfs.callTimeout", Long.toString(1000));
+    MockFileSystem mockFs = new MockFileSystem(fs, numberOfRetriesRequired, closeSucceed);
+    BucketWriter bucketWriter = new BucketWriter(
+        0, 0, 1, 1, ctx, hdfsPath, hdfsPath, "singleBucket", ".tmp", null, null,
+        null, new MockDataStream(mockFs), timedRollerPool, proxy,
+        new SinkCounter("test-bucket-writer-" + System.currentTimeMillis()), 0, null, null, 30000,
+        Executors.newSingleThreadExecutor(), 1, numberOfRetriesRequired);
+
+    bucketWriter.setFileSystem(mockFs);
+    // At this point, we checked if isFileClosed is available in
+    // this JVM, so let's make it check again.
+    Event event = EventBuilder.withBody("test", Charsets.UTF_8);
+    bucketWriter.append(event);
+    // This is what triggers the close, so a 2nd append is required :/
+    bucketWriter.append(event);
+
+    TimeUnit.SECONDS.sleep(numberOfRetriesRequired + 2);
+
+    Assert.assertTrue("Expected " + numberOfRetriesRequired + " " +
+                      "but got " + bucketWriter.renameTries.get(),
+                      bucketWriter.renameTries.get() == numberOfRetriesRequired);
+  }
+}
diff --git a/code/flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestHDFSCompressedDataStream.java b/code/flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestHDFSCompressedDataStream.java
new file mode 100644
index 0000000..80f199b
--- /dev/null
+++ b/code/flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestHDFSCompressedDataStream.java
@@ -0,0 +1,141 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flume.sink.hdfs;
+
+import java.io.File;
+import java.io.FileInputStream;
+import java.nio.ByteBuffer;
+import java.nio.charset.CharsetDecoder;
+import java.util.List;
+import java.util.zip.GZIPInputStream;
+
+import org.apache.avro.file.DataFileStream;
+import org.apache.avro.generic.GenericData;
+import org.apache.avro.generic.GenericDatumReader;
+import org.apache.avro.generic.GenericRecord;
+import org.apache.avro.io.DatumReader;
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.flume.event.EventBuilder;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.io.SequenceFile;
+import org.apache.hadoop.io.compress.CompressionCodecFactory;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import com.google.common.base.Charsets;
+import com.google.common.collect.Lists;
+
+public class TestHDFSCompressedDataStream {
+
+  private static final Logger logger =
+      LoggerFactory.getLogger(TestHDFSCompressedDataStream.class);
+
+  private File file;
+  private String fileURI;
+  private CompressionCodecFactory factory;
+
+  @Before
+  public void init() throws Exception {
+    this.file = new File("target/test/data/foo.gz");
+    this.fileURI = file.getAbsoluteFile().toURI().toString();
+    logger.info("File URI: {}", fileURI);
+
+    Configuration conf = new Configuration();
+    // local FS must be raw in order to be Syncable
+    conf.set("fs.file.impl", "org.apache.hadoop.fs.RawLocalFileSystem");
+    Path path = new Path(fileURI);
+    path.getFileSystem(conf); // get FS with our conf cached
+
+    this.factory = new CompressionCodecFactory(conf);
+  }
+
+  // make sure the data makes it to disk if we sync() the data stream
+  @Test
+  public void testGzipDurability() throws Exception {
+    Context context = new Context();
+    HDFSCompressedDataStream writer = new HDFSCompressedDataStream();
+    writer.configure(context);
+    writer.open(fileURI, factory.getCodec(new Path(fileURI)),
+        SequenceFile.CompressionType.BLOCK);
+
+    String[] bodies = { "yarf!" };
+    writeBodies(writer, bodies);
+
+    byte[] buf = new byte[256];
+    GZIPInputStream cmpIn = new GZIPInputStream(new FileInputStream(file));
+    int len = cmpIn.read(buf);
+    String result = new String(buf, 0, len, Charsets.UTF_8);
+    result = result.trim(); // BodyTextEventSerializer adds a newline
+
+    Assert.assertEquals("input and output must match", bodies[0], result);
+  }
+
+  @Test
+  public void testGzipDurabilityWithSerializer() throws Exception {
+    Context context = new Context();
+    context.put("serializer", "AVRO_EVENT");
+
+    HDFSCompressedDataStream writer = new HDFSCompressedDataStream();
+    writer.configure(context);
+
+    writer.open(fileURI, factory.getCodec(new Path(fileURI)),
+        SequenceFile.CompressionType.BLOCK);
+
+    String[] bodies = { "yarf!", "yarfing!" };
+    writeBodies(writer, bodies);
+
+    int found = 0;
+    int expected = bodies.length;
+    List<String> expectedBodies = Lists.newArrayList(bodies);
+
+    GZIPInputStream cmpIn = new GZIPInputStream(new FileInputStream(file));
+    DatumReader<GenericRecord> reader = new GenericDatumReader<GenericRecord>();
+    DataFileStream<GenericRecord> avroStream =
+        new DataFileStream<GenericRecord>(cmpIn, reader);
+    GenericRecord record = new GenericData.Record(avroStream.getSchema());
+    while (avroStream.hasNext()) {
+      avroStream.next(record);
+      CharsetDecoder decoder = Charsets.UTF_8.newDecoder();
+      String bodyStr = decoder.decode((ByteBuffer) record.get("body"))
+          .toString();
+      expectedBodies.remove(bodyStr);
+      found++;
+    }
+    avroStream.close();
+    cmpIn.close();
+
+    Assert.assertTrue("Found = " + found + ", Expected = " + expected
+        + ", Left = " + expectedBodies.size() + " " + expectedBodies,
+        expectedBodies.size() == 0);
+  }
+
+  private void writeBodies(HDFSCompressedDataStream writer, String... bodies)
+      throws Exception {
+    for (String body : bodies) {
+      Event evt = EventBuilder.withBody(body, Charsets.UTF_8);
+      writer.append(evt);
+    }
+    writer.sync();
+  }
+}
diff --git a/code/flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestHDFSEventSink.java b/code/flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestHDFSEventSink.java
new file mode 100644
index 0000000..782cf47
--- /dev/null
+++ b/code/flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestHDFSEventSink.java
@@ -0,0 +1,1548 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flume.sink.hdfs;
+
+import java.io.BufferedReader;
+import java.io.File;
+import java.io.IOException;
+import java.io.InputStreamReader;
+import java.nio.ByteBuffer;
+import java.nio.charset.CharsetDecoder;
+import java.util.Arrays;
+import java.util.Calendar;
+import java.util.Collection;
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+import java.util.UUID;
+import java.util.concurrent.TimeUnit;
+
+import com.google.common.collect.Maps;
+import org.apache.avro.file.DataFileStream;
+import org.apache.avro.generic.GenericData;
+import org.apache.avro.generic.GenericDatumReader;
+import org.apache.avro.generic.GenericRecord;
+import org.apache.avro.io.DatumReader;
+import org.apache.commons.lang.StringUtils;
+import org.apache.flume.Channel;
+import org.apache.flume.Clock;
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.flume.EventDeliveryException;
+import org.apache.flume.Sink.Status;
+import org.apache.flume.SystemClock;
+import org.apache.flume.Transaction;
+import org.apache.flume.channel.MemoryChannel;
+import org.apache.flume.conf.Configurables;
+import org.apache.flume.event.EventBuilder;
+import org.apache.flume.event.SimpleEvent;
+import org.apache.flume.lifecycle.LifecycleException;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.CommonConfigurationKeys;
+import org.apache.hadoop.fs.FSDataInputStream;
+import org.apache.hadoop.fs.FileStatus;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.FileUtil;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.io.BytesWritable;
+import org.apache.hadoop.io.LongWritable;
+import org.apache.hadoop.io.SequenceFile;
+import org.apache.hadoop.security.UserGroupInformation;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import com.google.common.base.Charsets;
+import com.google.common.collect.Lists;
+
+public class TestHDFSEventSink {
+
+  private HDFSEventSink sink;
+  private String testPath;
+  private static final Logger LOG = LoggerFactory
+      .getLogger(TestHDFSEventSink.class);
+
+  static {
+    System.setProperty("java.security.krb5.realm", "flume");
+    System.setProperty("java.security.krb5.kdc", "blah");
+  }
+
+  private void dirCleanup() {
+    Configuration conf = new Configuration();
+    try {
+      FileSystem fs = FileSystem.get(conf);
+      Path dirPath = new Path(testPath);
+      if (fs.exists(dirPath)) {
+        fs.delete(dirPath, true);
+      }
+    } catch (IOException eIO) {
+      LOG.warn("IO Error in test cleanup", eIO);
+    }
+  }
+
+  // TODO: use System.getProperty("file.separator") instead of hardcoded '/'
+  @Before
+  public void setUp() {
+    LOG.debug("Starting...");
+    /*
+     * FIXME: Use a dynamic path to support concurrent test execution. Also,
+     * beware of the case where this path is used for something or when the
+     * Hadoop config points at file:/// rather than hdfs://. We need to find a
+     * better way of testing HDFS related functionality.
+     */
+    testPath = "file:///tmp/flume-test."
+        + Calendar.getInstance().getTimeInMillis() + "."
+        + Thread.currentThread().getId();
+
+    sink = new HDFSEventSink();
+    sink.setName("HDFSEventSink-" + UUID.randomUUID().toString());
+    dirCleanup();
+  }
+
+  @After
+  public void tearDown() {
+    if (System.getenv("hdfs_keepFiles") == null) dirCleanup();
+  }
+
+  @Test
+  public void testTextBatchAppend() throws Exception {
+    doTestTextBatchAppend(false);
+  }
+
+  @Test
+  public void testTextBatchAppendRawFS() throws Exception {
+    doTestTextBatchAppend(true);
+  }
+
+  public void doTestTextBatchAppend(boolean useRawLocalFileSystem)
+      throws Exception {
+    LOG.debug("Starting...");
+
+    final long rollCount = 10;
+    final long batchSize = 2;
+    final String fileName = "FlumeData";
+    String newPath = testPath + "/singleTextBucket";
+    int totalEvents = 0;
+    int i = 1, j = 1;
+
+    // clear the test directory
+    Configuration conf = new Configuration();
+    FileSystem fs = FileSystem.get(conf);
+    Path dirPath = new Path(newPath);
+    fs.delete(dirPath, true);
+    fs.mkdirs(dirPath);
+
+    Context context = new Context();
+
+    // context.put("hdfs.path", testPath + "/%Y-%m-%d/%H");
+    context.put("hdfs.path", newPath);
+    context.put("hdfs.filePrefix", fileName);
+    context.put("hdfs.rollCount", String.valueOf(rollCount));
+    context.put("hdfs.rollInterval", "0");
+    context.put("hdfs.rollSize", "0");
+    context.put("hdfs.batchSize", String.valueOf(batchSize));
+    context.put("hdfs.writeFormat", "Text");
+    context.put("hdfs.useRawLocalFileSystem",
+        Boolean.toString(useRawLocalFileSystem));
+    context.put("hdfs.fileType", "DataStream");
+
+    Configurables.configure(sink, context);
+
+    Channel channel = new MemoryChannel();
+    Configurables.configure(channel, context);
+
+    sink.setChannel(channel);
+    sink.start();
+
+    Calendar eventDate = Calendar.getInstance();
+    List<String> bodies = Lists.newArrayList();
+
+    // push enough event batches into the channel to roll multiple times
+    for (i = 1; i <= (rollCount * 10) / batchSize; i++) {
+      Transaction txn = channel.getTransaction();
+      txn.begin();
+      for (j = 1; j <= batchSize; j++) {
+        Event event = new SimpleEvent();
+        eventDate.clear();
+        eventDate.set(2011, i, i, i, 0); // year, month, day, hour, minute
+        String body = "Test." + i + "." + j;
+        event.setBody(body.getBytes());
+        bodies.add(body);
+        channel.put(event);
+        totalEvents++;
+      }
+      txn.commit();
+      txn.close();
+
+      // execute sink to process the events
+      sink.process();
+    }
+
+    sink.stop();
+
+    // loop through all the files generated and check their contents
+    FileStatus[] dirStat = fs.listStatus(dirPath);
+    Path[] fList = FileUtil.stat2Paths(dirStat);
+
+    // check that the roll happened correctly for the given data
+    long expectedFiles = totalEvents / rollCount;
+    if (totalEvents % rollCount > 0) expectedFiles++;
+    Assert.assertEquals("num files wrong, found: " +
+        Lists.newArrayList(fList), expectedFiles, fList.length);
+    // check the contents of all the files
+    verifyOutputTextFiles(fs, conf, dirPath.toUri().getPath(), fileName, bodies);
+  }
+
+  @Test
+  public void testLifecycle() throws InterruptedException, LifecycleException {
+    LOG.debug("Starting...");
+    Context context = new Context();
+
+    context.put("hdfs.path", testPath);
+    /*
+     * context.put("hdfs.rollInterval", String.class);
+     * context.get("hdfs.rollSize", String.class); context.get("hdfs.rollCount",
+     * String.class);
+     */
+    Configurables.configure(sink, context);
+
+    sink.setChannel(new MemoryChannel());
+
+    sink.start();
+    sink.stop();
+  }
+
+  @Test
+  public void testEmptyChannelResultsInStatusBackoff()
+      throws InterruptedException, LifecycleException, EventDeliveryException {
+    LOG.debug("Starting...");
+    Context context = new Context();
+    Channel channel = new MemoryChannel();
+    context.put("hdfs.path", testPath);
+    context.put("keep-alive", "0");
+    Configurables.configure(sink, context);
+    Configurables.configure(channel, context);
+    sink.setChannel(channel);
+    sink.start();
+    Assert.assertEquals(Status.BACKOFF, sink.process());
+    sink.stop();
+  }
+
+  @Test
+  public void testKerbFileAccess() throws InterruptedException,
+      LifecycleException, EventDeliveryException, IOException {
+    LOG.debug("Starting testKerbFileAccess() ...");
+    final String fileName = "FlumeData";
+    final long rollCount = 5;
+    final long batchSize = 2;
+    String newPath = testPath + "/singleBucket";
+    String kerbConfPrincipal = "user1/localhost@EXAMPLE.COM";
+    String kerbKeytab = "/usr/lib/flume/nonexistkeytabfile";
+
+    //turn security on
+    Configuration conf = new Configuration();
+    conf.set(CommonConfigurationKeys.HADOOP_SECURITY_AUTHENTICATION,
+        "kerberos");
+    UserGroupInformation.setConfiguration(conf);
+
+    Context context = new Context();
+    context.put("hdfs.path", newPath);
+    context.put("hdfs.filePrefix", fileName);
+    context.put("hdfs.rollCount", String.valueOf(rollCount));
+    context.put("hdfs.batchSize", String.valueOf(batchSize));
+    context.put("hdfs.kerberosPrincipal", kerbConfPrincipal);
+    context.put("hdfs.kerberosKeytab", kerbKeytab);
+
+    try {
+      Configurables.configure(sink, context);
+      Assert.fail("no exception thrown");
+    } catch (IllegalArgumentException expected) {
+      Assert.assertTrue(expected.getMessage().contains(
+          "Keytab is not a readable file"));
+    } finally {
+      //turn security off
+      conf.set(CommonConfigurationKeys.HADOOP_SECURITY_AUTHENTICATION,
+          "simple");
+      UserGroupInformation.setConfiguration(conf);
+    }
+  }
+
+  @Test
+  public void testTextAppend() throws InterruptedException, LifecycleException,
+      EventDeliveryException, IOException {
+
+    LOG.debug("Starting...");
+    final long rollCount = 3;
+    final long batchSize = 2;
+    final String fileName = "FlumeData";
+    String newPath = testPath + "/singleTextBucket";
+    int totalEvents = 0;
+    int i = 1, j = 1;
+
+    // clear the test directory
+    Configuration conf = new Configuration();
+    FileSystem fs = FileSystem.get(conf);
+    Path dirPath = new Path(newPath);
+    fs.delete(dirPath, true);
+    fs.mkdirs(dirPath);
+
+    Context context = new Context();
+
+    // context.put("hdfs.path", testPath + "/%Y-%m-%d/%H");
+    context.put("hdfs.path", newPath);
+    context.put("hdfs.filePrefix", fileName);
+    context.put("hdfs.rollCount", String.valueOf(rollCount));
+    context.put("hdfs.batchSize", String.valueOf(batchSize));
+    context.put("hdfs.writeFormat", "Text");
+    context.put("hdfs.fileType", "DataStream");
+
+    Configurables.configure(sink, context);
+
+    Channel channel = new MemoryChannel();
+    Configurables.configure(channel, context);
+
+    sink.setChannel(channel);
+    sink.start();
+
+    Calendar eventDate = Calendar.getInstance();
+    List<String> bodies = Lists.newArrayList();
+
+    // push the event batches into channel
+    for (i = 1; i < 4; i++) {
+      Transaction txn = channel.getTransaction();
+      txn.begin();
+      for (j = 1; j <= batchSize; j++) {
+        Event event = new SimpleEvent();
+        eventDate.clear();
+        eventDate.set(2011, i, i, i, 0); // year, month, day, hour, minute
+        event.getHeaders().put("timestamp",
+            String.valueOf(eventDate.getTimeInMillis()));
+        event.getHeaders().put("hostname", "Host" + i);
+        String body = "Test." + i + "." + j;
+        event.setBody(body.getBytes());
+        bodies.add(body);
+        channel.put(event);
+        totalEvents++;
+      }
+      txn.commit();
+      txn.close();
+
+      // execute sink to process the events
+      sink.process();
+    }
+
+    sink.stop();
+
+    // loop through all the files generated and check their contents
+    FileStatus[] dirStat = fs.listStatus(dirPath);
+    Path[] fList = FileUtil.stat2Paths(dirStat);
+
+    // check that the roll happened correctly for the given data
+    long expectedFiles = totalEvents / rollCount;
+    if (totalEvents % rollCount > 0) expectedFiles++;
+    Assert.assertEquals("num files wrong, found: " +
+        Lists.newArrayList(fList), expectedFiles, fList.length);
+    verifyOutputTextFiles(fs, conf, dirPath.toUri().getPath(), fileName, bodies);
+  }
+
+  @Test
+  public void testAvroAppend() throws InterruptedException, LifecycleException,
+      EventDeliveryException, IOException {
+
+    LOG.debug("Starting...");
+    final long rollCount = 3;
+    final long batchSize = 2;
+    final String fileName = "FlumeData";
+    String newPath = testPath + "/singleTextBucket";
+    int totalEvents = 0;
+    int i = 1, j = 1;
+
+    // clear the test directory
+    Configuration conf = new Configuration();
+    FileSystem fs = FileSystem.get(conf);
+    Path dirPath = new Path(newPath);
+    fs.delete(dirPath, true);
+    fs.mkdirs(dirPath);
+
+    Context context = new Context();
+
+    // context.put("hdfs.path", testPath + "/%Y-%m-%d/%H");
+    context.put("hdfs.path", newPath);
+    context.put("hdfs.filePrefix", fileName);
+    context.put("hdfs.rollCount", String.valueOf(rollCount));
+    context.put("hdfs.batchSize", String.valueOf(batchSize));
+    context.put("hdfs.writeFormat", "Text");
+    context.put("hdfs.fileType", "DataStream");
+    context.put("serializer", "AVRO_EVENT");
+
+    Configurables.configure(sink, context);
+
+    Channel channel = new MemoryChannel();
+    Configurables.configure(channel, context);
+
+    sink.setChannel(channel);
+    sink.start();
+
+    Calendar eventDate = Calendar.getInstance();
+    List<String> bodies = Lists.newArrayList();
+
+    // push the event batches into channel
+    for (i = 1; i < 4; i++) {
+      Transaction txn = channel.getTransaction();
+      txn.begin();
+      for (j = 1; j <= batchSize; j++) {
+        Event event = new SimpleEvent();
+        eventDate.clear();
+        eventDate.set(2011, i, i, i, 0); // year, month, day, hour, minute
+        event.getHeaders().put("timestamp",
+            String.valueOf(eventDate.getTimeInMillis()));
+        event.getHeaders().put("hostname", "Host" + i);
+        String body = "Test." + i + "." + j;
+        event.setBody(body.getBytes());
+        bodies.add(body);
+        channel.put(event);
+        totalEvents++;
+      }
+      txn.commit();
+      txn.close();
+
+      // execute sink to process the events
+      sink.process();
+    }
+
+    sink.stop();
+
+    // loop through all the files generated and check their contents
+    FileStatus[] dirStat = fs.listStatus(dirPath);
+    Path[] fList = FileUtil.stat2Paths(dirStat);
+
+    // check that the roll happened correctly for the given data
+    long expectedFiles = totalEvents / rollCount;
+    if (totalEvents % rollCount > 0) expectedFiles++;
+    Assert.assertEquals("num files wrong, found: " +
+        Lists.newArrayList(fList), expectedFiles, fList.length);
+    verifyOutputAvroFiles(fs, conf, dirPath.toUri().getPath(), fileName, bodies);
+  }
+
+  @Test
+  public void testSimpleAppend() throws InterruptedException,
+      LifecycleException, EventDeliveryException, IOException {
+
+    LOG.debug("Starting...");
+    final String fileName = "FlumeData";
+    final long rollCount = 5;
+    final long batchSize = 2;
+    final int numBatches = 4;
+    String newPath = testPath + "/singleBucket";
+    int totalEvents = 0;
+    int i = 1, j = 1;
+
+    // clear the test directory
+    Configuration conf = new Configuration();
+    FileSystem fs = FileSystem.get(conf);
+    Path dirPath = new Path(newPath);
+    fs.delete(dirPath, true);
+    fs.mkdirs(dirPath);
+
+    Context context = new Context();
+
+    context.put("hdfs.path", newPath);
+    context.put("hdfs.filePrefix", fileName);
+    context.put("hdfs.rollCount", String.valueOf(rollCount));
+    context.put("hdfs.batchSize", String.valueOf(batchSize));
+
+    Configurables.configure(sink, context);
+
+    Channel channel = new MemoryChannel();
+    Configurables.configure(channel, context);
+
+    sink.setChannel(channel);
+    sink.start();
+
+    Calendar eventDate = Calendar.getInstance();
+    List<String> bodies = Lists.newArrayList();
+
+    // push the event batches into channel
+    for (i = 1; i < numBatches; i++) {
+      Transaction txn = channel.getTransaction();
+      txn.begin();
+      for (j = 1; j <= batchSize; j++) {
+        Event event = new SimpleEvent();
+        eventDate.clear();
+        eventDate.set(2011, i, i, i, 0); // year, month (0-based), day, hour, minute
+        event.getHeaders().put("timestamp",
+            String.valueOf(eventDate.getTimeInMillis()));
+        event.getHeaders().put("hostname", "Host" + i);
+        String body = "Test." + i + "." + j;
+        event.setBody(body.getBytes());
+        bodies.add(body);
+        channel.put(event);
+        totalEvents++;
+      }
+      txn.commit();
+      txn.close();
+
+      // execute sink to process the events
+      sink.process();
+    }
+
+    sink.stop();
+
+    // loop through all the files generated and check their contents
+    FileStatus[] dirStat = fs.listStatus(dirPath);
+    Path[] fList = FileUtil.stat2Paths(dirStat);
+
+    // check that the roll happened correctly for the given data
+    long expectedFiles = totalEvents / rollCount;
+    if (totalEvents % rollCount > 0) expectedFiles++;
+    Assert.assertEquals("num files wrong, found: " +
+        Lists.newArrayList(fList), expectedFiles, fList.length);
+    verifyOutputSequenceFiles(fs, conf, dirPath.toUri().getPath(), fileName, bodies);
+  }
+
+  @Test
+  public void testSimpleAppendLocalTime()
+      throws InterruptedException, LifecycleException, EventDeliveryException, IOException {
+    final long currentTime = System.currentTimeMillis();
+    Clock clk = new Clock() {
+      @Override
+      public long currentTimeMillis() {
+        return currentTime;
+      }
+    };
+
+    LOG.debug("Starting...");
+    final String fileName = "FlumeData";
+    final long rollCount = 5;
+    final long batchSize = 2;
+    final int numBatches = 4;
+    String newPath = testPath + "/singleBucket/%s";
+    String expectedPath = testPath + "/singleBucket/" +
+        String.valueOf(currentTime / 1000);
+    int totalEvents = 0;
+    int i = 1, j = 1;
+
+    // clear the test directory
+    Configuration conf = new Configuration();
+    FileSystem fs = FileSystem.get(conf);
+    Path dirPath = new Path(expectedPath);
+    fs.delete(dirPath, true);
+    fs.mkdirs(dirPath);
+
+    Context context = new Context();
+
+    context.put("hdfs.path", newPath);
+    context.put("hdfs.filePrefix", fileName);
+    context.put("hdfs.rollCount", String.valueOf(rollCount));
+    context.put("hdfs.batchSize", String.valueOf(batchSize));
+    context.put("hdfs.useLocalTimeStamp", String.valueOf(true));
+
+    Configurables.configure(sink, context);
+
+    Channel channel = new MemoryChannel();
+    Configurables.configure(channel, context);
+
+    sink.setChannel(channel);
+    sink.setBucketClock(clk);
+    sink.start();
+
+    Calendar eventDate = Calendar.getInstance();
+    List<String> bodies = Lists.newArrayList();
+
+    // push the event batches into channel
+    for (i = 1; i < numBatches; i++) {
+      Transaction txn = channel.getTransaction();
+      txn.begin();
+      for (j = 1; j <= batchSize; j++) {
+        Event event = new SimpleEvent();
+        eventDate.clear();
+        eventDate.set(2011, i, i, i, 0); // year, month (0-based), day, hour, minute
+        event.getHeaders().put("timestamp",
+            String.valueOf(eventDate.getTimeInMillis()));
+        event.getHeaders().put("hostname", "Host" + i);
+        String body = "Test." + i + "." + j;
+        event.setBody(body.getBytes());
+        bodies.add(body);
+        channel.put(event);
+        totalEvents++;
+      }
+      txn.commit();
+      txn.close();
+
+      // execute sink to process the events
+      sink.process();
+    }
+
+    sink.stop();
+
+    // loop through all the files generated and check their contents
+    FileStatus[] dirStat = fs.listStatus(dirPath);
+    Path[] fList = FileUtil.stat2Paths(dirStat);
+
+    // check that the roll happened correctly for the given data
+    long expectedFiles = totalEvents / rollCount;
+    if (totalEvents % rollCount > 0) expectedFiles++;
+    Assert.assertEquals("num files wrong, found: " +
+        Lists.newArrayList(fList), expectedFiles, fList.length);
+    verifyOutputSequenceFiles(fs, conf, dirPath.toUri().getPath(), fileName, bodies);
+    // The clock in bucketpath is static, so restore the real clock
+    sink.setBucketClock(new SystemClock());
+  }
+
+  @Test
+  public void testAppend() throws InterruptedException, LifecycleException,
+      EventDeliveryException, IOException {
+
+    LOG.debug("Starting...");
+    final long rollCount = 3;
+    final long batchSize = 2;
+    final String fileName = "FlumeData";
+
+    // clear the test directory
+    Configuration conf = new Configuration();
+    FileSystem fs = FileSystem.get(conf);
+    Path dirPath = new Path(testPath);
+    fs.delete(dirPath, true);
+    fs.mkdirs(dirPath);
+
+    Context context = new Context();
+
+    context.put("hdfs.path", testPath + "/%Y-%m-%d/%H");
+    context.put("hdfs.timeZone", "UTC");
+    context.put("hdfs.filePrefix", fileName);
+    context.put("hdfs.rollCount", String.valueOf(rollCount));
+    context.put("hdfs.batchSize", String.valueOf(batchSize));
+
+    Configurables.configure(sink, context);
+
+    Channel channel = new MemoryChannel();
+    Configurables.configure(channel, context);
+
+    sink.setChannel(channel);
+    sink.start();
+
+    Calendar eventDate = Calendar.getInstance();
+    List<String> bodies = Lists.newArrayList();
+    // push the event batches into channel
+    for (int i = 1; i < 4; i++) {
+      Transaction txn = channel.getTransaction();
+      txn.begin();
+      for (int j = 1; j <= batchSize; j++) {
+        Event event = new SimpleEvent();
+        eventDate.clear();
+        eventDate.set(2011, i, i, i, 0); // year, month (0-based), day, hour, minute
+        event.getHeaders().put("timestamp",
+            String.valueOf(eventDate.getTimeInMillis()));
+        event.getHeaders().put("hostname", "Host" + i);
+        String body = "Test." + i + "." + j;
+        event.setBody(body.getBytes());
+        bodies.add(body);
+        channel.put(event);
+      }
+      txn.commit();
+      txn.close();
+
+      // execute sink to process the events
+      sink.process();
+    }
+
+    sink.stop();
+    verifyOutputSequenceFiles(fs, conf, dirPath.toUri().getPath(), fileName, bodies);
+  }
+
+  // inject fault and make sure that the txn is rolled back and retried
+  @Test
+  public void testBadSimpleAppend() throws InterruptedException,
+      LifecycleException, EventDeliveryException, IOException {
+
+    LOG.debug("Starting...");
+    final String fileName = "FlumeData";
+    final long rollCount = 5;
+    final long batchSize = 2;
+    final int numBatches = 4;
+    String newPath = testPath + "/singleBucket";
+    int totalEvents = 0;
+    int i = 1, j = 1;
+
+    HDFSTestWriterFactory badWriterFactory = new HDFSTestWriterFactory();
+    sink = new HDFSEventSink(badWriterFactory);
+
+    // clear the test directory
+    Configuration conf = new Configuration();
+    FileSystem fs = FileSystem.get(conf);
+    Path dirPath = new Path(newPath);
+    fs.delete(dirPath, true);
+    fs.mkdirs(dirPath);
+
+    Context context = new Context();
+
+    context.put("hdfs.path", newPath);
+    context.put("hdfs.filePrefix", fileName);
+    context.put("hdfs.rollCount", String.valueOf(rollCount));
+    context.put("hdfs.batchSize", String.valueOf(batchSize));
+    context.put("hdfs.fileType", HDFSTestWriterFactory.TestSequenceFileType);
+
+    Configurables.configure(sink, context);
+
+    Channel channel = new MemoryChannel();
+    Configurables.configure(channel, context);
+
+    sink.setChannel(channel);
+    sink.start();
+
+    Calendar eventDate = Calendar.getInstance();
+
+    List<String> bodies = Lists.newArrayList();
+    // push the event batches into channel
+    for (i = 1; i < numBatches; i++) {
+      Transaction txn = channel.getTransaction();
+      txn.begin();
+      for (j = 1; j <= batchSize; j++) {
+        Event event = new SimpleEvent();
+        eventDate.clear();
+        eventDate.set(2011, i, i, i, 0); // year, month (0-based), day, hour, minute
+        event.getHeaders().put("timestamp",
+            String.valueOf(eventDate.getTimeInMillis()));
+        event.getHeaders().put("hostname", "Host" + i);
+
+        String body = "Test." + i + "." + j;
+        event.setBody(body.getBytes());
+        bodies.add(body);
+        // inject fault
+        if ((totalEvents % 30) == 1) {
+          event.getHeaders().put("fault-once", "");
+        }
+        channel.put(event);
+        totalEvents++;
+      }
+      txn.commit();
+      txn.close();
+
+      LOG.info("Process events: " + sink.process());
+    }
+    LOG.info("Process events to end of transaction max: " + sink.process());
+    LOG.info("Process events to injected fault: " + sink.process());
+    LOG.info("Process events remaining events: " + sink.process());
+    sink.stop();
+    verifyOutputSequenceFiles(fs, conf, dirPath.toUri().getPath(), fileName, bodies);
+
+  }
+
+
+  private List<String> getAllFiles(String input) {
+    List<String> output = Lists.newArrayList();
+    File dir = new File(input);
+    if (dir.isFile()) {
+      output.add(dir.getAbsolutePath());
+    } else if (dir.isDirectory()) {
+      for (String file : dir.list()) {
+        File subDir = new File(dir, file);
+        output.addAll(getAllFiles(subDir.getAbsolutePath()));
+      }
+    }
+    return output;
+  }
+
+  private void verifyOutputSequenceFiles(FileSystem fs, Configuration conf, String dir,
+                                         String prefix, List<String> bodies) throws IOException {
+    int found = 0;
+    int expected = bodies.size();
+    for (String outputFile : getAllFiles(dir)) {
+      String name = (new File(outputFile)).getName();
+      if (name.startsWith(prefix)) {
+        SequenceFile.Reader reader = new SequenceFile.Reader(fs, new Path(outputFile), conf);
+        LongWritable key = new LongWritable();
+        BytesWritable value = new BytesWritable();
+        while (reader.next(key, value)) {
+          String body = new String(value.getBytes(), 0, value.getLength());
+          if (bodies.contains(body)) {
+            LOG.debug("Found event body: {}", body);
+            bodies.remove(body);
+            found++;
+          }
+        }
+        reader.close();
+      }
+    }
+    if (!bodies.isEmpty()) {
+      for (String body : bodies) {
+        LOG.error("Never found event body: {}", body);
+      }
+    }
+    Assert.assertTrue("Found = " + found + ", Expected = "  +
+        expected + ", Left = " + bodies.size() + " " + bodies,
+          bodies.size() == 0);
+
+  }
+
+  private void verifyOutputTextFiles(FileSystem fs, Configuration conf, String dir, String prefix,
+                                     List<String> bodies) throws IOException {
+    int found = 0;
+    int expected = bodies.size();
+    for (String outputFile : getAllFiles(dir)) {
+      String name = (new File(outputFile)).getName();
+      if (name.startsWith(prefix)) {
+        FSDataInputStream input = fs.open(new Path(outputFile));
+        BufferedReader reader = new BufferedReader(new InputStreamReader(input));
+        String body = null;
+        while ((body = reader.readLine()) != null) {
+          bodies.remove(body);
+          found++;
+        }
+        reader.close();
+      }
+    }
+    Assert.assertTrue("Found = " + found + ", Expected = "  +
+        expected + ", Left = " + bodies.size() + " " + bodies,
+          bodies.size() == 0);
+
+  }
+
+  private void verifyOutputAvroFiles(FileSystem fs, Configuration conf, String dir, String prefix,
+                                     List<String> bodies) throws IOException {
+    int found = 0;
+    int expected = bodies.size();
+    for (String outputFile : getAllFiles(dir)) {
+      String name = (new File(outputFile)).getName();
+      if (name.startsWith(prefix)) {
+        FSDataInputStream input = fs.open(new Path(outputFile));
+        DatumReader<GenericRecord> reader = new GenericDatumReader<GenericRecord>();
+        DataFileStream<GenericRecord> avroStream =
+            new DataFileStream<GenericRecord>(input, reader);
+        GenericRecord record = new GenericData.Record(avroStream.getSchema());
+        while (avroStream.hasNext()) {
+          avroStream.next(record);
+          ByteBuffer body = (ByteBuffer) record.get("body");
+          CharsetDecoder decoder = Charsets.UTF_8.newDecoder();
+          String bodyStr = decoder.decode(body).toString();
+          LOG.debug("Removing event: {}", bodyStr);
+          bodies.remove(bodyStr);
+          found++;
+        }
+        avroStream.close();
+        input.close();
+      }
+    }
+    Assert.assertTrue("Found = " + found + ", Expected = "  +
+        expected + ", Left = " + bodies.size() + " " + bodies,
+            bodies.size() == 0);
+  }
+
+  /**
+   * Ensure that when a write throws an IOException we are
+   * able to continue to progress in the next process() call.
+   * This relies on Transactional rollback semantics for durability and
+   * the behavior of the BucketWriter class of close()ing upon IOException.
+   */
+  @Test
+  public void testCloseReopen()
+      throws InterruptedException, LifecycleException, EventDeliveryException, IOException {
+
+    LOG.debug("Starting...");
+    final int numBatches = 4;
+    final String fileName = "FlumeData";
+    final long rollCount = 5;
+    final long batchSize = 2;
+    String newPath = testPath + "/singleBucket";
+    int i = 1, j = 1;
+
+    HDFSTestWriterFactory badWriterFactory = new HDFSTestWriterFactory();
+    sink = new HDFSEventSink(badWriterFactory);
+
+    // clear the test directory
+    Configuration conf = new Configuration();
+    FileSystem fs = FileSystem.get(conf);
+    Path dirPath = new Path(newPath);
+    fs.delete(dirPath, true);
+    fs.mkdirs(dirPath);
+
+    Context context = new Context();
+
+    context.put("hdfs.path", newPath);
+    context.put("hdfs.filePrefix", fileName);
+    context.put("hdfs.rollCount", String.valueOf(rollCount));
+    context.put("hdfs.batchSize", String.valueOf(batchSize));
+    context.put("hdfs.fileType", HDFSTestWriterFactory.TestSequenceFileType);
+
+    Configurables.configure(sink, context);
+
+    MemoryChannel channel = new MemoryChannel();
+    Configurables.configure(channel, new Context());
+
+    sink.setChannel(channel);
+    sink.start();
+
+    Calendar eventDate = Calendar.getInstance();
+    List<String> bodies = Lists.newArrayList();
+    // push the event batches into channel
+    for (i = 1; i < numBatches; i++) {
+      channel.getTransaction().begin();
+      try {
+        for (j = 1; j <= batchSize; j++) {
+          Event event = new SimpleEvent();
+          eventDate.clear();
+          eventDate.set(2011, i, i, i, 0); // year, month (0-based), day, hour, minute
+          event.getHeaders().put("timestamp",
+              String.valueOf(eventDate.getTimeInMillis()));
+          event.getHeaders().put("hostname", "Host" + i);
+          String body = "Test." + i + "." + j;
+          event.setBody(body.getBytes());
+          bodies.add(body);
+          // inject fault
+          event.getHeaders().put("fault-until-reopen", "");
+          channel.put(event);
+        }
+        channel.getTransaction().commit();
+      } finally {
+        channel.getTransaction().close();
+      }
+      LOG.info("execute sink to process the events: " + sink.process());
+    }
+    LOG.info("clear any events pending due to errors: " + sink.process());
+    sink.stop();
+
+    verifyOutputSequenceFiles(fs, conf, dirPath.toUri().getPath(), fileName, bodies);
+  }
+
+  /**
+   * Test that the old bucket writer is closed at the end of rollInterval and
+   * a new one is used for the next set of events.
+   */
+  @Test
+  public void testCloseReopenOnRollTime()
+      throws InterruptedException, LifecycleException, EventDeliveryException, IOException {
+
+    LOG.debug("Starting...");
+    final int numBatches = 4;
+    final String fileName = "FlumeData";
+    final long batchSize = 2;
+    String newPath = testPath + "/singleBucket";
+    int i = 1, j = 1;
+
+    HDFSTestWriterFactory badWriterFactory = new HDFSTestWriterFactory();
+    sink = new HDFSEventSink(badWriterFactory);
+
+    // clear the test directory
+    Configuration conf = new Configuration();
+    FileSystem fs = FileSystem.get(conf);
+    Path dirPath = new Path(newPath);
+    fs.delete(dirPath, true);
+    fs.mkdirs(dirPath);
+
+    Context context = new Context();
+
+    context.put("hdfs.path", newPath);
+    context.put("hdfs.filePrefix", fileName);
+    context.put("hdfs.rollCount", String.valueOf(0));
+    context.put("hdfs.rollSize", String.valueOf(0));
+    context.put("hdfs.rollInterval", String.valueOf(2));
+    context.put("hdfs.batchSize", String.valueOf(batchSize));
+    context.put("hdfs.fileType", HDFSTestWriterFactory.TestSequenceFileType);
+
+    Configurables.configure(sink, context);
+
+    MemoryChannel channel = new MemoryChannel();
+    Configurables.configure(channel, new Context());
+
+    sink.setChannel(channel);
+    sink.start();
+
+    Calendar eventDate = Calendar.getInstance();
+    List<String> bodies = Lists.newArrayList();
+    // push the event batches into channel
+    for (i = 1; i < numBatches; i++) {
+      channel.getTransaction().begin();
+      try {
+        for (j = 1; j <= batchSize; j++) {
+          Event event = new SimpleEvent();
+          eventDate.clear();
+          eventDate.set(2011, i, i, i, 0); // year, month (0-based), day, hour, minute
+          event.getHeaders().put("timestamp",
+              String.valueOf(eventDate.getTimeInMillis()));
+          event.getHeaders().put("hostname", "Host" + i);
+          String body = "Test." + i + "." + j;
+          event.setBody(body.getBytes());
+          bodies.add(body);
+          // inject fault
+          event.getHeaders().put("count-check", "");
+          channel.put(event);
+        }
+        channel.getTransaction().commit();
+      } finally {
+        channel.getTransaction().close();
+      }
+      LOG.info("execute sink to process the events: " + sink.process());
+      // Make sure the first file gets rolled due to rollInterval.
+      if (i == 1) {
+        Thread.sleep(2001);
+      }
+    }
+    LOG.info("clear any events pending due to errors: " + sink.process());
+    sink.stop();
+
+    Assert.assertTrue(badWriterFactory.openCount.get() >= 2);
+    LOG.info("Total number of bucket writers opened: {}",
+        badWriterFactory.openCount.get());
+    verifyOutputSequenceFiles(fs, conf, dirPath.toUri().getPath(), fileName,
+        bodies);
+  }
+
+  /**
+   * Test that a close due to roll interval removes the bucketwriter from
+   * sfWriters map.
+   */
+  @Test
+  public void testCloseRemovesFromSFWriters()
+      throws InterruptedException, LifecycleException, EventDeliveryException, IOException {
+
+    LOG.debug("Starting...");
+    final String fileName = "FlumeData";
+    final long batchSize = 2;
+    String newPath = testPath + "/singleBucket";
+    int i = 1, j = 1;
+
+    HDFSTestWriterFactory badWriterFactory = new HDFSTestWriterFactory();
+    sink = new HDFSEventSink(badWriterFactory);
+
+    // clear the test directory
+    Configuration conf = new Configuration();
+    FileSystem fs = FileSystem.get(conf);
+    Path dirPath = new Path(newPath);
+    fs.delete(dirPath, true);
+    fs.mkdirs(dirPath);
+
+    Context context = new Context();
+
+    context.put("hdfs.path", newPath);
+    context.put("hdfs.filePrefix", fileName);
+    context.put("hdfs.rollCount", String.valueOf(0));
+    context.put("hdfs.rollSize", String.valueOf(0));
+    context.put("hdfs.rollInterval", String.valueOf(1));
+    context.put("hdfs.batchSize", String.valueOf(batchSize));
+    context.put("hdfs.fileType", HDFSTestWriterFactory.TestSequenceFileType);
+    String expectedLookupPath = newPath + "/FlumeData";
+
+    Configurables.configure(sink, context);
+
+    MemoryChannel channel = new MemoryChannel();
+    Configurables.configure(channel, new Context());
+
+    sink.setChannel(channel);
+    sink.start();
+
+    Calendar eventDate = Calendar.getInstance();
+    List<String> bodies = Lists.newArrayList();
+    // push the event batches into channel
+    channel.getTransaction().begin();
+    try {
+      for (j = 1; j <= 2 * batchSize; j++) {
+        Event event = new SimpleEvent();
+        eventDate.clear();
+        eventDate.set(2011, i, i, i, 0); // year, month (0-based), day, hour, minute
+        event.getHeaders().put("timestamp",
+            String.valueOf(eventDate.getTimeInMillis()));
+        event.getHeaders().put("hostname", "Host" + i);
+        String body = "Test." + i + "." + j;
+        event.setBody(body.getBytes());
+        bodies.add(body);
+        // inject fault
+        event.getHeaders().put("count-check", "");
+        channel.put(event);
+      }
+      channel.getTransaction().commit();
+    } finally {
+      channel.getTransaction().close();
+    }
+    LOG.info("execute sink to process the events: " + sink.process());
+    Assert.assertTrue(sink.getSfWriters().containsKey(expectedLookupPath));
+    // Make sure the first file gets rolled due to rollInterval.
+    Thread.sleep(2001);
+    Assert.assertFalse(sink.getSfWriters().containsKey(expectedLookupPath));
+    LOG.info("execute sink to process the events: " + sink.process());
+    // A new bucket writer should have been created for this bucket, so the
+    // sfWriters map should contain the same key again.
+    Assert.assertTrue(sink.getSfWriters().containsKey(expectedLookupPath));
+    sink.stop();
+
+    LOG.info("Total number of bucket writers opened: {}",
+        badWriterFactory.openCount.get());
+    verifyOutputSequenceFiles(fs, conf, dirPath.toUri().getPath(), fileName,
+        bodies);
+  }
+
+
+
+  /*
+   * append using slow sink writer.
+   * verify that the process returns backoff due to timeout
+   */
+  @Test
+  public void testSlowAppendFailure() throws InterruptedException,
+      LifecycleException, EventDeliveryException, IOException {
+
+    LOG.debug("Starting...");
+    final String fileName = "FlumeData";
+    final long rollCount = 5;
+    final long batchSize = 2;
+    final int numBatches = 2;
+    String newPath = testPath + "/singleBucket";
+    int i = 1, j = 1;
+
+    // clear the test directory
+    Configuration conf = new Configuration();
+    FileSystem fs = FileSystem.get(conf);
+    Path dirPath = new Path(newPath);
+    fs.delete(dirPath, true);
+    fs.mkdirs(dirPath);
+
+    // create HDFS sink with slow writer
+    HDFSTestWriterFactory badWriterFactory = new HDFSTestWriterFactory();
+    sink = new HDFSEventSink(badWriterFactory);
+
+    Context context = new Context();
+    context.put("hdfs.path", newPath);
+    context.put("hdfs.filePrefix", fileName);
+    context.put("hdfs.rollCount", String.valueOf(rollCount));
+    context.put("hdfs.batchSize", String.valueOf(batchSize));
+    context.put("hdfs.fileType", HDFSTestWriterFactory.TestSequenceFileType);
+    context.put("hdfs.callTimeout", Long.toString(1000));
+    Configurables.configure(sink, context);
+
+    Channel channel = new MemoryChannel();
+    Configurables.configure(channel, context);
+
+    sink.setChannel(channel);
+    sink.start();
+
+    Calendar eventDate = Calendar.getInstance();
+
+    // push the event batches into channel
+    for (i = 0; i < numBatches; i++) {
+      Transaction txn = channel.getTransaction();
+      txn.begin();
+      for (j = 1; j <= batchSize; j++) {
+        Event event = new SimpleEvent();
+        eventDate.clear();
+        eventDate.set(2011, i, i, i, 0); // year, month (0-based), day, hour, minute
+        event.getHeaders().put("timestamp",
+            String.valueOf(eventDate.getTimeInMillis()));
+        event.getHeaders().put("hostname", "Host" + i);
+        event.getHeaders().put("slow", "1500");
+        event.setBody(("Test." + i + "." + j).getBytes());
+        channel.put(event);
+      }
+      txn.commit();
+      txn.close();
+
+      // execute sink to process the events
+      Status status = sink.process();
+
+      // verify that the append returned backoff due to timeout
+      Assert.assertEquals(Status.BACKOFF, status);
+    }
+
+    sink.stop();
+  }
+
+  /*
+   * append using slow sink writer with specified append timeout
+   * verify that the data is written correctly to files
+   */
+  private void slowAppendTestHelper(long appendTimeout)
+      throws InterruptedException, IOException, LifecycleException,
+             EventDeliveryException {
+    final String fileName = "FlumeData";
+    final long rollCount = 5;
+    final long batchSize = 2;
+    final int numBatches = 2;
+    String newPath = testPath + "/singleBucket";
+    int totalEvents = 0;
+    int i = 1, j = 1;
+
+    // clear the test directory
+    Configuration conf = new Configuration();
+    FileSystem fs = FileSystem.get(conf);
+    Path dirPath = new Path(newPath);
+    fs.delete(dirPath, true);
+    fs.mkdirs(dirPath);
+
+    // create HDFS sink with slow writer
+    HDFSTestWriterFactory badWriterFactory = new HDFSTestWriterFactory();
+    sink = new HDFSEventSink(badWriterFactory);
+
+    Context context = new Context();
+    context.put("hdfs.path", newPath);
+    context.put("hdfs.filePrefix", fileName);
+    context.put("hdfs.rollCount", String.valueOf(rollCount));
+    context.put("hdfs.batchSize", String.valueOf(batchSize));
+    context.put("hdfs.fileType", HDFSTestWriterFactory.TestSequenceFileType);
+    context.put("hdfs.appendTimeout", String.valueOf(appendTimeout));
+    Configurables.configure(sink, context);
+
+    Channel channel = new MemoryChannel();
+    Configurables.configure(channel, context);
+
+    sink.setChannel(channel);
+    sink.start();
+
+    Calendar eventDate = Calendar.getInstance();
+    List<String> bodies = Lists.newArrayList();
+    // push the event batches into channel
+    for (i = 0; i < numBatches; i++) {
+      Transaction txn = channel.getTransaction();
+      txn.begin();
+      for (j = 1; j <= batchSize; j++) {
+        Event event = new SimpleEvent();
+        eventDate.clear();
+        eventDate.set(2011, i, i, i, 0); // year, month (0-based), day, hour, minute
+        event.getHeaders().put("timestamp",
+            String.valueOf(eventDate.getTimeInMillis()));
+        event.getHeaders().put("hostname", "Host" + i);
+        event.getHeaders().put("slow", "1500");
+        String body = "Test." + i + "." + j;
+        event.setBody(body.getBytes());
+        bodies.add(body);
+        channel.put(event);
+        totalEvents++;
+      }
+      txn.commit();
+      txn.close();
+
+      // execute sink to process the events
+      sink.process();
+    }
+
+    sink.stop();
+
+    // loop through all the files generated and check their contents
+    FileStatus[] dirStat = fs.listStatus(dirPath);
+    Path[] fList = FileUtil.stat2Paths(dirStat);
+
+    // check that the roll happened correctly for the given data
+    // Note that we'll end up with two files with only a header
+    long expectedFiles = totalEvents / rollCount;
+    if (totalEvents % rollCount > 0) expectedFiles++;
+    Assert.assertEquals("num files wrong, found: " +
+        Lists.newArrayList(fList), expectedFiles, fList.length);
+    verifyOutputSequenceFiles(fs, conf, dirPath.toUri().getPath(), fileName, bodies);
+  }
+
+  /*
+   * append using slow sink writer with long append timeout
+   * verify that the data is written correctly to files
+   */
+  @Test
+  public void testSlowAppendWithLongTimeout() throws InterruptedException,
+      LifecycleException, EventDeliveryException, IOException {
+    LOG.debug("Starting...");
+    slowAppendTestHelper(3000);
+  }
+
+  /*
+   * append using slow sink writer with no timeout to make append
+   * synchronous. Verify that the data is written correctly to files
+   */
+  @Test
+  public void testSlowAppendWithoutTimeout() throws InterruptedException,
+      LifecycleException, EventDeliveryException, IOException {
+    LOG.debug("Starting...");
+    slowAppendTestHelper(0);
+  }
+  @Test
+  public void testCloseOnIdle() throws IOException, EventDeliveryException, InterruptedException {
+    String hdfsPath = testPath + "/idleClose";
+
+    Configuration conf = new Configuration();
+    FileSystem fs = FileSystem.get(conf);
+    Path dirPath = new Path(hdfsPath);
+    fs.delete(dirPath, true);
+    fs.mkdirs(dirPath);
+    Context context = new Context();
+    context.put("hdfs.path", hdfsPath);
+    /*
+     * All three rolling methods are disabled so the only
+     * way a file can roll is through the idle timeout.
+     */
+    context.put("hdfs.rollCount", "0");
+    context.put("hdfs.rollSize", "0");
+    context.put("hdfs.rollInterval", "0");
+    context.put("hdfs.batchSize", "2");
+    context.put("hdfs.idleTimeout", "1");
+    Configurables.configure(sink, context);
+
+    Channel channel = new MemoryChannel();
+    Configurables.configure(channel, context);
+
+    sink.setChannel(channel);
+    sink.start();
+
+    Transaction txn = channel.getTransaction();
+    txn.begin();
+    for (int i = 0; i < 10; i++) {
+      Event event = new SimpleEvent();
+      event.setBody(("test event " + i).getBytes());
+      channel.put(event);
+    }
+    txn.commit();
+    txn.close();
+
+    sink.process();
+    sink.process();
+    Thread.sleep(1001);
+    // previous file should have timed out now
+    // this can throw BucketClosedException (from the bucketWriter having
+    // closed); this is not an issue, as the sink will retry and get a fresh
+    // bucketWriter so long as the onClose handler properly removes
+    // bucket writers that were closed.
+    sink.process();
+    sink.process();
+    Thread.sleep(500); // shouldn't be enough for a timeout to occur
+    sink.process();
+    sink.process();
+    sink.stop();
+    FileStatus[] dirStat = fs.listStatus(dirPath);
+    Path[] fList = FileUtil.stat2Paths(dirStat);
+    Assert.assertEquals("Incorrect content of the directory " + StringUtils.join(fList, ","),
+                        2, fList.length);
+    Assert.assertTrue(!fList[0].getName().endsWith(".tmp") &&
+                      !fList[1].getName().endsWith(".tmp"));
+    fs.close();
+  }
+
+  /**
+   * This test simulates what happens when a batch of events is written to a compressed sequence
+   * file (and thus hsync'd to hdfs) but the file is not yet closed.
+   *
+   * When this happens, the data that we wrote should still be readable.
+   */
+  @Test
+  public void testBlockCompressSequenceFileWriterSync() throws IOException, EventDeliveryException {
+    String hdfsPath = testPath + "/sequenceFileWriterSync";
+    FileSystem fs = FileSystem.get(new Configuration());
+    // Since we are reading a partial file we don't want to use checksums
+    fs.setVerifyChecksum(false);
+    fs.setWriteChecksum(false);
+
+    // Compression codecs that don't require native hadoop libraries
+    String[] codecs = {"BZip2Codec", "DeflateCodec"};
+
+    for (String codec : codecs) {
+      sequenceFileWriteAndVerifyEvents(fs, hdfsPath, codec, Collections.singletonList(
+          "single-event"
+      ));
+
+      sequenceFileWriteAndVerifyEvents(fs, hdfsPath, codec, Arrays.asList(
+          "multiple-events-1",
+          "multiple-events-2",
+          "multiple-events-3",
+          "multiple-events-4",
+          "multiple-events-5"
+      ));
+    }
+
+    fs.close();
+  }
+
+  private void sequenceFileWriteAndVerifyEvents(FileSystem fs, String hdfsPath, String codec,
+                                                Collection<String> eventBodies)
+      throws IOException, EventDeliveryException {
+    Path dirPath = new Path(hdfsPath);
+    fs.delete(dirPath, true);
+    fs.mkdirs(dirPath);
+
+    Context context = new Context();
+    context.put("hdfs.path", hdfsPath);
+    // Ensure the file isn't closed and rolled
+    context.put("hdfs.rollCount", String.valueOf(eventBodies.size() + 1));
+    context.put("hdfs.rollSize", "0");
+    context.put("hdfs.rollInterval", "0");
+    context.put("hdfs.batchSize", "1");
+    context.put("hdfs.fileType", "SequenceFile");
+    context.put("hdfs.codeC", codec);
+    context.put("hdfs.writeFormat", "Writable");
+    Configurables.configure(sink, context);
+
+    Channel channel = new MemoryChannel();
+    Configurables.configure(channel, context);
+
+    sink.setChannel(channel);
+    sink.start();
+
+    for (String eventBody : eventBodies) {
+      Transaction txn = channel.getTransaction();
+      txn.begin();
+
+      Event event = new SimpleEvent();
+      event.setBody(eventBody.getBytes());
+      channel.put(event);
+
+      txn.commit();
+      txn.close();
+
+      sink.process();
+    }
+
+    // Sink is _not_ closed.  The file should remain open but
+    // the data written should be visible to readers via sync + hflush
+    FileStatus[] dirStat = fs.listStatus(dirPath);
+    Path[] paths = FileUtil.stat2Paths(dirStat);
+
+    Assert.assertEquals(1, paths.length);
+
+    SequenceFile.Reader reader =
+        new SequenceFile.Reader(fs.getConf(), SequenceFile.Reader.stream(fs.open(paths[0])));
+    LongWritable key = new LongWritable();
+    BytesWritable value = new BytesWritable();
+
+    for (String eventBody : eventBodies) {
+      Assert.assertTrue(reader.next(key, value));
+      Assert.assertArrayEquals(eventBody.getBytes(), value.copyBytes());
+    }
+
+    Assert.assertFalse(reader.next(key, value));
+  }
+
+  private Context getContextForRetryTests() {
+    Context context = new Context();
+
+    context.put("hdfs.path", testPath + "/%{retryHeader}");
+    context.put("hdfs.filePrefix", "test");
+    context.put("hdfs.batchSize", String.valueOf(100));
+    context.put("hdfs.fileType", "DataStream");
+    context.put("hdfs.serializer", "text");
+    context.put("hdfs.closeTries", "3");
+    context.put("hdfs.rollCount", "1");
+    context.put("hdfs.retryInterval", "1");
+    return context;
+  }
+
+  @Test
+  public void testBadConfigurationForRetryIntervalZero() throws Exception {
+    Context context = getContextForRetryTests();
+    context.put("hdfs.retryInterval", "0");
+
+    Configurables.configure(sink, context);
+    Assert.assertEquals(1, sink.getTryCount());
+  }
+
+  @Test
+  public void testBadConfigurationForRetryIntervalNegative() throws Exception {
+    Context context = getContextForRetryTests();
+    context.put("hdfs.retryInterval", "-1");
+
+    Configurables.configure(sink, context);
+    Assert.assertEquals(1, sink.getTryCount());
+  }
+
+  @Test
+  public void testBadConfigurationForRetryCountZero() throws Exception {
+    Context context = getContextForRetryTests();
+    context.put("hdfs.closeTries", "0");
+
+    Configurables.configure(sink, context);
+    Assert.assertEquals(Integer.MAX_VALUE, sink.getTryCount());
+  }
+
+  @Test
+  public void testBadConfigurationForRetryCountNegative() throws Exception {
+    Context context = getContextForRetryTests();
+    context.put("hdfs.closeTries", "-4");
+
+    Configurables.configure(sink, context);
+    Assert.assertEquals(Integer.MAX_VALUE, sink.getTryCount());
+  }
+
+  @Test
+  public void testRetryRename()
+      throws InterruptedException, LifecycleException, EventDeliveryException, IOException {
+    testRetryRename(true);
+    testRetryRename(false);
+  }
+
+  private void testRetryRename(boolean closeSucceed)
+      throws InterruptedException, LifecycleException, EventDeliveryException, IOException {
+    LOG.debug("Starting...");
+    String newPath = testPath + "/retryBucket";
+
+    // clear the test directory
+    Configuration conf = new Configuration();
+    FileSystem fs = FileSystem.get(conf);
+    Path dirPath = new Path(newPath);
+    fs.delete(dirPath, true);
+    fs.mkdirs(dirPath);
+    MockFileSystem mockFs = new MockFileSystem(fs, 6, closeSucceed);
+
+    Context context = getContextForRetryTests();
+    Configurables.configure(sink, context);
+
+    Channel channel = new MemoryChannel();
+    Configurables.configure(channel, context);
+
+    sink.setChannel(channel);
+    sink.setMockFs(mockFs);
+    HDFSWriter hdfsWriter = new MockDataStream(mockFs);
+    hdfsWriter.configure(context);
+    sink.setMockWriter(hdfsWriter);
+    sink.start();
+
+    // push the event batches into channel
+    for (int i = 0; i < 2; i++) {
+      Transaction txn = channel.getTransaction();
+      txn.begin();
+      Map<String, String> hdr = Maps.newHashMap();
+      hdr.put("retryHeader", "v1");
+
+      channel.put(EventBuilder.withBody("random".getBytes(), hdr));
+      txn.commit();
+      txn.close();
+
+      // execute sink to process the events
+      sink.process();
+    }
+    // push the event batches into channel
+    for (int i = 0; i < 2; i++) {
+      Transaction txn = channel.getTransaction();
+      txn.begin();
+      Map<String, String> hdr = Maps.newHashMap();
+      hdr.put("retryHeader", "v2");
+      channel.put(EventBuilder.withBody("random".getBytes(), hdr));
+      txn.commit();
+      txn.close();
+      // execute sink to process the events
+      sink.process();
+    }
+
+    TimeUnit.SECONDS.sleep(5); // Sleep until all retries are done.
+
+    Collection<BucketWriter> writers = sink.getSfWriters().values();
+
+    int totalRenameAttempts = 0;
+    for (BucketWriter writer : writers) {
+      LOG.info("Rename tries = " + writer.renameTries.get());
+      totalRenameAttempts += writer.renameTries.get();
+    }
+    // stop() clears the sfWriters map, so the rename attempts must be
+    // counted before stopping the sink.
+    sink.stop();
+    Assert.assertEquals(6, totalRenameAttempts);
+
+  }
+}
diff --git a/code/flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestHDFSEventSinkOnMiniCluster.java b/code/flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestHDFSEventSinkOnMiniCluster.java
new file mode 100644
index 0000000..7c1caaa
--- /dev/null
+++ b/code/flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestHDFSEventSinkOnMiniCluster.java
@@ -0,0 +1,486 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flume.sink.hdfs;
+
+import com.google.common.base.Charsets;
+import java.io.BufferedReader;
+import java.io.File;
+import java.io.IOException;
+import java.io.InputStreamReader;
+import java.util.zip.GZIPInputStream;
+import org.apache.commons.io.FileUtils;
+import org.apache.flume.Context;
+import org.apache.flume.EventDeliveryException;
+import org.apache.flume.channel.MemoryChannel;
+import org.apache.flume.event.EventBuilder;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FSDataInputStream;
+import org.apache.hadoop.fs.FileStatus;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hdfs.MiniDFSCluster;
+import org.junit.After;
+import org.junit.AfterClass;
+import org.junit.Assert;
+import org.junit.BeforeClass;
+import org.junit.Ignore;
+import org.junit.Test;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * Unit tests that exercise HDFSEventSink on an actual instance of HDFS.
+ * TODO: figure out how to unit-test Kerberos-secured HDFS.
+ */
+public class TestHDFSEventSinkOnMiniCluster {
+
+  private static final Logger logger =
+      LoggerFactory.getLogger(TestHDFSEventSinkOnMiniCluster.class);
+
+  private static final boolean KEEP_DATA = false;
+  private static final String DFS_DIR = "target/test/dfs";
+  private static final String TEST_BUILD_DATA_KEY = "test.build.data";
+
+  private static MiniDFSCluster cluster = null;
+  private static String oldTestBuildDataProp = null;
+
+  @BeforeClass
+  public static void setupClass() throws IOException {
+    // set up data dir for HDFS
+    File dfsDir = new File(DFS_DIR);
+    if (!dfsDir.isDirectory()) {
+      dfsDir.mkdirs();
+    }
+    // save off system prop to restore later
+    oldTestBuildDataProp = System.getProperty(TEST_BUILD_DATA_KEY);
+    System.setProperty(TEST_BUILD_DATA_KEY, DFS_DIR);
+  }
+
+  private static String getNameNodeURL(MiniDFSCluster cluster) {
+    int nnPort = cluster.getNameNode().getNameNodeAddress().getPort();
+    return "hdfs://localhost:" + nnPort;
+  }
+
+  /**
+   * This is a very basic test that writes one event to HDFS and reads it back.
+   */
+  @Test
+  public void simpleHDFSTest() throws EventDeliveryException, IOException {
+    cluster = new MiniDFSCluster(new Configuration(), 1, true, null);
+    cluster.waitActive();
+
+    String outputDir = "/flume/simpleHDFSTest";
+    Path outputDirPath = new Path(outputDir);
+
+    logger.info("Running test with output dir: {}", outputDir);
+
+    FileSystem fs = cluster.getFileSystem();
+    // ensure output directory is empty
+    if (fs.exists(outputDirPath)) {
+      fs.delete(outputDirPath, true);
+    }
+
+    String nnURL = getNameNodeURL(cluster);
+    logger.info("Namenode address: {}", nnURL);
+
+    Context chanCtx = new Context();
+    MemoryChannel channel = new MemoryChannel();
+    channel.setName("simpleHDFSTest-mem-chan");
+    channel.configure(chanCtx);
+    channel.start();
+
+    Context sinkCtx = new Context();
+    sinkCtx.put("hdfs.path", nnURL + outputDir);
+    sinkCtx.put("hdfs.fileType", HDFSWriterFactory.DataStreamType);
+    sinkCtx.put("hdfs.batchSize", Integer.toString(1));
+
+    HDFSEventSink sink = new HDFSEventSink();
+    sink.setName("simpleHDFSTest-hdfs-sink");
+    sink.configure(sinkCtx);
+    sink.setChannel(channel);
+    sink.start();
+
+    // create an event
+    String EVENT_BODY = "yarg!";
+    channel.getTransaction().begin();
+    try {
+      channel.put(EventBuilder.withBody(EVENT_BODY, Charsets.UTF_8));
+      channel.getTransaction().commit();
+    } finally {
+      channel.getTransaction().close();
+    }
+
+    // store event to HDFS
+    sink.process();
+
+    // shut down flume
+    sink.stop();
+    channel.stop();
+
+    // verify that it's in HDFS and that its content is what we say it should be
+    FileStatus[] statuses = fs.listStatus(outputDirPath);
+    Assert.assertNotNull("No files found written to HDFS", statuses);
+    Assert.assertEquals("Only one file expected", 1, statuses.length);
+
+    for (FileStatus status : statuses) {
+      Path filePath = status.getPath();
+      logger.info("Found file on DFS: {}", filePath);
+      FSDataInputStream stream = fs.open(filePath);
+      BufferedReader reader = new BufferedReader(new InputStreamReader(stream));
+      String line = reader.readLine();
+      logger.info("First line in file {}: {}", filePath, line);
+      Assert.assertEquals(EVENT_BODY, line);
+    }
+
+    if (!KEEP_DATA) {
+      fs.delete(outputDirPath, true);
+    }
+
+    cluster.shutdown();
+    cluster = null;
+  }
+
+  /**
+   * Writes two events using GZIP-compressed serialization and reads the first one back.
+   */
+  @Test
+  public void simpleHDFSGZipCompressedTest() throws EventDeliveryException, IOException {
+    cluster = new MiniDFSCluster(new Configuration(), 1, true, null);
+    cluster.waitActive();
+
+    String outputDir = "/flume/simpleHDFSGZipCompressedTest";
+    Path outputDirPath = new Path(outputDir);
+
+    logger.info("Running test with output dir: {}", outputDir);
+
+    FileSystem fs = cluster.getFileSystem();
+    // ensure output directory is empty
+    if (fs.exists(outputDirPath)) {
+      fs.delete(outputDirPath, true);
+    }
+
+    String nnURL = getNameNodeURL(cluster);
+    logger.info("Namenode address: {}", nnURL);
+
+    Context chanCtx = new Context();
+    MemoryChannel channel = new MemoryChannel();
+    channel.setName("simpleHDFSTest-mem-chan");
+    channel.configure(chanCtx);
+    channel.start();
+
+    Context sinkCtx = new Context();
+    sinkCtx.put("hdfs.path", nnURL + outputDir);
+    sinkCtx.put("hdfs.fileType", HDFSWriterFactory.CompStreamType);
+    sinkCtx.put("hdfs.batchSize", Integer.toString(1));
+    sinkCtx.put("hdfs.codeC", "gzip");
+
+    HDFSEventSink sink = new HDFSEventSink();
+    sink.setName("simpleHDFSTest-hdfs-sink");
+    sink.configure(sinkCtx);
+    sink.setChannel(channel);
+    sink.start();
+
+    // create two events
+    String EVENT_BODY_1 = "yarg1";
+    String EVENT_BODY_2 = "yarg2";
+    channel.getTransaction().begin();
+    try {
+      channel.put(EventBuilder.withBody(EVENT_BODY_1, Charsets.UTF_8));
+      channel.put(EventBuilder.withBody(EVENT_BODY_2, Charsets.UTF_8));
+      channel.getTransaction().commit();
+    } finally {
+      channel.getTransaction().close();
+    }
+
+    // store event to HDFS
+    sink.process();
+
+    // shut down flume
+    sink.stop();
+    channel.stop();
+
+    // verify that it's in HDFS and that its content is what we say it should be
+    FileStatus[] statuses = fs.listStatus(outputDirPath);
+    Assert.assertNotNull("No files found written to HDFS", statuses);
+    Assert.assertEquals("Only one file expected", 1, statuses.length);
+
+    for (FileStatus status : statuses) {
+      Path filePath = status.getPath();
+      logger.info("Found file on DFS: {}", filePath);
+      FSDataInputStream stream = fs.open(filePath);
+      BufferedReader reader = new BufferedReader(new InputStreamReader(
+          new GZIPInputStream(stream)));
+      String line = reader.readLine();
+      logger.info("First line in file {}: {}", filePath, line);
+      Assert.assertEquals(EVENT_BODY_1, line);
+
+      // The rest of this test is commented-out (will fail) for 2 reasons:
+      //
+      // (1) At the time of this writing, Hadoop has a bug which causes the
+      // non-native gzip implementation to create invalid gzip files when
+      // finish() and resetState() are called. See HADOOP-8522.
+      //
+      // (2) Even if HADOOP-8522 is fixed, the JDK GZIPInputStream is unable
+      // to read multi-member (concatenated) gzip files. See this Sun bug:
+      // http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4691425
+      //
+      //line = reader.readLine();
+      //logger.info("Second line in file {}: {}", filePath, line);
+      //Assert.assertEquals(EVENT_BODY_2, line);
+    }
+
+    if (!KEEP_DATA) {
+      fs.delete(outputDirPath, true);
+    }
+
+    cluster.shutdown();
+    cluster = null;
+  }
+
+  /**
+   * Writes six events, stops one of three datanodes mid-stream, and expects 4 or 5 files.
+   */
+  @Test
+  public void underReplicationTest() throws EventDeliveryException,
+      IOException {
+    Configuration conf = new Configuration();
+    conf.set("dfs.replication", String.valueOf(3));
+    cluster = new MiniDFSCluster(conf, 3, true, null);
+    cluster.waitActive();
+
+    String outputDir = "/flume/underReplicationTest";
+    Path outputDirPath = new Path(outputDir);
+
+    logger.info("Running test with output dir: {}", outputDir);
+
+    FileSystem fs = cluster.getFileSystem();
+    // ensure output directory is empty
+    if (fs.exists(outputDirPath)) {
+      fs.delete(outputDirPath, true);
+    }
+
+    String nnURL = getNameNodeURL(cluster);
+    logger.info("Namenode address: {}", nnURL);
+
+    Context chanCtx = new Context();
+    MemoryChannel channel = new MemoryChannel();
+    channel.setName("simpleHDFSTest-mem-chan");
+    channel.configure(chanCtx);
+    channel.start();
+
+    Context sinkCtx = new Context();
+    sinkCtx.put("hdfs.path", nnURL + outputDir);
+    sinkCtx.put("hdfs.fileType", HDFSWriterFactory.DataStreamType);
+    sinkCtx.put("hdfs.batchSize", Integer.toString(1));
+
+    HDFSEventSink sink = new HDFSEventSink();
+    sink.setName("simpleHDFSTest-hdfs-sink");
+    sink.configure(sinkCtx);
+    sink.setChannel(channel);
+    sink.start();
+
+    // create six events
+    channel.getTransaction().begin();
+    try {
+      channel.put(EventBuilder.withBody("yarg 1", Charsets.UTF_8));
+      channel.put(EventBuilder.withBody("yarg 2", Charsets.UTF_8));
+      channel.put(EventBuilder.withBody("yarg 3", Charsets.UTF_8));
+      channel.put(EventBuilder.withBody("yarg 4", Charsets.UTF_8));
+      channel.put(EventBuilder.withBody("yarg 5", Charsets.UTF_8));
+      channel.put(EventBuilder.withBody("yarg 6", Charsets.UTF_8));
+      channel.getTransaction().commit();
+    } finally {
+      channel.getTransaction().close();
+    }
+
+    // store events to HDFS
+    logger.info("Running process(). Create new file.");
+    sink.process(); // create new file;
+    logger.info("Running process(). Same file.");
+    sink.process();
+
+    // kill a datanode
+    logger.info("Killing datanode #1...");
+    cluster.stopDataNode(0);
+
+    // There is a race here: the client may or may not notice that the
+    // datanode is dead before its next sync(), so this next call may or
+    // may not roll a new file.
+
+    logger.info("Running process(). Create new file? (racy)");
+    sink.process();
+
+    logger.info("Running process(). Create new file.");
+    sink.process();
+
+    logger.info("Running process(). Create new file.");
+    sink.process();
+
+    logger.info("Running process(). Create new file.");
+    sink.process();
+
+    // shut down flume
+    sink.stop();
+    channel.stop();
+
+    // verify that it's in HDFS and that its content is what we say it should be
+    FileStatus[] statuses = fs.listStatus(outputDirPath);
+    Assert.assertNotNull("No files found written to HDFS", statuses);
+
+    for (FileStatus status : statuses) {
+      Path filePath = status.getPath();
+      logger.info("Found file on DFS: {}", filePath);
+      FSDataInputStream stream = fs.open(filePath);
+      BufferedReader reader = new BufferedReader(new InputStreamReader(stream));
+      String line = reader.readLine();
+      logger.info("First line in file {}: {}", filePath, line);
+      Assert.assertTrue(line.startsWith("yarg"));
+    }
+
+    Assert.assertTrue("4 or 5 files expected, found " + statuses.length,
+        statuses.length == 4 || statuses.length == 5);
+    logger.info("There are {} files.", statuses.length);
+
+    if (!KEEP_DATA) {
+      fs.delete(outputDirPath, true);
+    }
+
+    cluster.shutdown();
+    cluster = null;
+  }
+
+  /**
+   * Writes 50 events, stops one of three datanodes mid-stream, and expects exactly 31 files.
+   */
+  @Ignore("This test is flaky and causes tests to fail pretty often.")
+  @Test
+  public void maxUnderReplicationTest() throws EventDeliveryException,
+      IOException {
+    Configuration conf = new Configuration();
+    conf.set("dfs.replication", String.valueOf(3));
+    cluster = new MiniDFSCluster(conf, 3, true, null);
+    cluster.waitActive();
+
+    String outputDir = "/flume/underReplicationTest";
+    Path outputDirPath = new Path(outputDir);
+
+    logger.info("Running test with output dir: {}", outputDir);
+
+    FileSystem fs = cluster.getFileSystem();
+    // ensure output directory is empty
+    if (fs.exists(outputDirPath)) {
+      fs.delete(outputDirPath, true);
+    }
+
+    String nnURL = getNameNodeURL(cluster);
+    logger.info("Namenode address: {}", nnURL);
+
+    Context chanCtx = new Context();
+    MemoryChannel channel = new MemoryChannel();
+    channel.setName("simpleHDFSTest-mem-chan");
+    channel.configure(chanCtx);
+    channel.start();
+
+    Context sinkCtx = new Context();
+    sinkCtx.put("hdfs.path", nnURL + outputDir);
+    sinkCtx.put("hdfs.fileType", HDFSWriterFactory.DataStreamType);
+    sinkCtx.put("hdfs.batchSize", Integer.toString(1));
+
+    HDFSEventSink sink = new HDFSEventSink();
+    sink.setName("simpleHDFSTest-hdfs-sink");
+    sink.configure(sinkCtx);
+    sink.setChannel(channel);
+    sink.start();
+
+    // create 50 events
+    channel.getTransaction().begin();
+    try {
+      for (int i = 0; i < 50; i++) {
+        channel.put(EventBuilder.withBody("yarg " + i, Charsets.UTF_8));
+      }
+      channel.getTransaction().commit();
+    } finally {
+      channel.getTransaction().close();
+    }
+
+    // store events to HDFS
+    logger.info("Running process(). Create new file.");
+    sink.process(); // create new file;
+    logger.info("Running process(). Same file.");
+    sink.process();
+
+    // kill a datanode
+    logger.info("Killing datanode #1...");
+    cluster.stopDataNode(0);
+
+    // There is a race here: the client may or may not notice that the
+    // datanode is dead before its next sync(), so this next call may or
+    // may not roll a new file.
+
+    logger.info("Running process(). Create new file? (racy)");
+    sink.process();
+
+    for (int i = 3; i < 50; i++) {
+      logger.info("Running process().");
+      sink.process();
+    }
+
+    // shut down flume
+    sink.stop();
+    channel.stop();
+
+    // verify that it's in HDFS and that its content is what we say it should be
+    FileStatus[] statuses = fs.listStatus(outputDirPath);
+    Assert.assertNotNull("No files found written to HDFS", statuses);
+
+    for (FileStatus status : statuses) {
+      Path filePath = status.getPath();
+      logger.info("Found file on DFS: {}", filePath);
+      FSDataInputStream stream = fs.open(filePath);
+      BufferedReader reader = new BufferedReader(new InputStreamReader(stream));
+      String line = reader.readLine();
+      logger.info("First line in file {}: {}", filePath, line);
+      Assert.assertTrue(line.startsWith("yarg"));
+    }
+
+    logger.info("There are {} files.", statuses.length);
+    Assert.assertEquals("31 files expected, found " + statuses.length,
+        31, statuses.length);
+
+    if (!KEEP_DATA) {
+      fs.delete(outputDirPath, true);
+    }
+
+    cluster.shutdown();
+    cluster = null;
+  }
+
+  @AfterClass
+  public static void teardownClass() {
+    // restore system state, if needed
+    if (oldTestBuildDataProp != null) {
+      System.setProperty(TEST_BUILD_DATA_KEY, oldTestBuildDataProp);
+    }
+
+    if (!KEEP_DATA) {
+      FileUtils.deleteQuietly(new File(DFS_DIR));
+    }
+  }
+
+}
diff --git a/code/flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestSequenceFileSerializerFactory.java b/code/flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestSequenceFileSerializerFactory.java
new file mode 100644
index 0000000..974e857
--- /dev/null
+++ b/code/flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestSequenceFileSerializerFactory.java
@@ -0,0 +1,58 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.flume.sink.hdfs;
+
+import org.apache.flume.Context;
+import org.junit.Test;
+
+import static org.junit.Assert.assertTrue;
+
+public class TestSequenceFileSerializerFactory {
+
+  @Test
+  public void getTextFormatter() {
+    SequenceFileSerializer formatter =
+        SequenceFileSerializerFactory.getSerializer("Text", new Context());
+
+    assertTrue(formatter != null);
+    assertTrue(formatter.getClass().getName(),
+        formatter instanceof HDFSTextSerializer);
+  }
+
+  @Test
+  public void getWritableFormatter() {
+    SequenceFileSerializer formatter =
+        SequenceFileSerializerFactory.getSerializer("Writable", new Context());
+
+    assertTrue(formatter != null);
+    assertTrue(formatter.getClass().getName(),
+        formatter instanceof HDFSWritableSerializer);
+  }
+
+  @Test
+  public void getCustomFormatter() {
+    SequenceFileSerializer formatter = SequenceFileSerializerFactory.getSerializer(
+        "org.apache.flume.sink.hdfs.MyCustomSerializer$Builder", new Context());
+
+    assertTrue(formatter != null);
+    assertTrue(formatter.getClass().getName(),
+        formatter instanceof MyCustomSerializer);
+  }
+
+}
diff --git a/code/flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestUseRawLocalFileSystem.java b/code/flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestUseRawLocalFileSystem.java
new file mode 100644
index 0000000..f3e7d10
--- /dev/null
+++ b/code/flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestUseRawLocalFileSystem.java
@@ -0,0 +1,95 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.flume.sink.hdfs;
+
+import java.io.File;
+import org.apache.commons.io.FileUtils;
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.flume.event.EventBuilder;
+import org.apache.hadoop.io.SequenceFile.CompressionType;
+import org.apache.hadoop.io.compress.GzipCodec;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import com.google.common.base.Charsets;
+import com.google.common.io.Files;
+
+public class TestUseRawLocalFileSystem {
+
+  private static final Logger logger =
+      LoggerFactory.getLogger(TestUseRawLocalFileSystem.class);
+  private Context context;
+
+  private File baseDir;
+  private File testFile;
+  private Event event;
+
+  @Before
+  public void setup() throws Exception {
+    baseDir = Files.createTempDir();
+    testFile = new File(baseDir.getAbsoluteFile(), "test");
+    context = new Context();
+    event = EventBuilder.withBody("test", Charsets.UTF_8);
+  }
+
+  @After
+  public void teardown() throws Exception {
+    FileUtils.deleteQuietly(baseDir);
+  }
+
+  @Test
+  public void testTestFile() throws Exception {
+    String file = testFile.getCanonicalPath();
+    HDFSDataStream stream = new HDFSDataStream();
+    context.put("hdfs.useRawLocalFileSystem", "true");
+    stream.configure(context);
+    stream.open(file);
+    stream.append(event);
+    stream.sync();
+    Assert.assertTrue(testFile.length() > 0);
+  }
+  @Test
+  public void testCompressedFile() throws Exception {
+    String file = testFile.getCanonicalPath();
+    HDFSCompressedDataStream stream = new HDFSCompressedDataStream();
+    context.put("hdfs.useRawLocalFileSystem", "true");
+    stream.configure(context);
+    stream.open(file, new GzipCodec(), CompressionType.RECORD);
+    stream.append(event);
+    stream.sync();
+    Assert.assertTrue(testFile.length() > 0);
+  }
+  @Test
+  public void testSequenceFile() throws Exception {
+    String file = testFile.getCanonicalPath();
+    HDFSSequenceFile stream = new HDFSSequenceFile();
+    context.put("hdfs.useRawLocalFileSystem", "true");
+    stream.configure(context);
+    stream.open(file);
+    stream.append(event);
+    stream.sync();
+    Assert.assertTrue(testFile.length() > 0);
+  }
+
+}
\ No newline at end of file
diff --git a/code/flume-ng-sinks/flume-hdfs-sink/src/test/resources/log4j.properties b/code/flume-ng-sinks/flume-hdfs-sink/src/test/resources/log4j.properties
new file mode 100644
index 0000000..252b5ea
--- /dev/null
+++ b/code/flume-ng-sinks/flume-hdfs-sink/src/test/resources/log4j.properties
@@ -0,0 +1,26 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+log4j.rootLogger = INFO, out
+
+log4j.appender.out = org.apache.log4j.ConsoleAppender
+log4j.appender.out.layout = org.apache.log4j.PatternLayout
+log4j.appender.out.layout.ConversionPattern = %d (%t) [%p - %l] %m%n
+
+log4j.logger.org.apache.flume = DEBUG
+log4j.logger.org.apache.hadoop = WARN
+log4j.logger.org.mortbay = WARN
diff --git a/code/flume-ng-sinks/flume-hive-sink/pom.xml b/code/flume-ng-sinks/flume-hive-sink/pom.xml
new file mode 100644
index 0000000..6d9ee47
--- /dev/null
+++ b/code/flume-ng-sinks/flume-hive-sink/pom.xml
@@ -0,0 +1,186 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to You under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+     http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+
+  <modelVersion>4.0.0</modelVersion>
+
+  <parent>
+    <groupId>org.apache.flume</groupId>
+    <artifactId>flume-ng-sinks</artifactId>
+    <version>1.7.0</version>
+  </parent>
+
+  <groupId>org.apache.flume.flume-ng-sinks</groupId>
+  <artifactId>flume-hive-sink</artifactId>
+  <name>Flume NG Hive Sink</name>
+
+  <build>
+    <plugins>
+      <plugin>
+        <groupId>org.apache.rat</groupId>
+        <artifactId>apache-rat-plugin</artifactId>
+      </plugin>
+    </plugins>
+  </build>
+
+  <profiles>
+    <profile>
+      <id>hadoop-1.0</id>
+      <activation>
+        <property>
+          <name>flume.hadoop.profile</name>
+          <value>1</value>
+        </property>
+      </activation>
+
+      <dependencies>
+        <dependency>
+          <groupId>org.apache.hadoop</groupId>
+          <artifactId>hadoop-core</artifactId>
+          <version>${hadoop.version}</version>
+          <scope>test</scope>
+        </dependency>
+      </dependencies>
+    </profile>
+    <profile>
+      <id>hadoop-2</id>
+      <activation>
+        <property>
+          <name>flume.hadoop.profile</name>
+          <value>2</value>
+        </property>
+      </activation>
+      <dependencies>
+        <dependency>
+          <groupId>org.apache.hadoop</groupId>
+          <artifactId>hadoop-common</artifactId>
+          <version>${hadoop.version}</version>
+          <scope>test</scope>
+          <optional>true</optional>
+        </dependency>
+
+        <dependency>
+          <groupId>org.apache.hadoop</groupId>
+          <artifactId>hadoop-mapreduce-client-core</artifactId>
+          <scope>test</scope>
+          <version>${hadoop.version}</version>
+        </dependency>
+      </dependencies>
+    </profile>
+
+    <profile>
+      <id>hbase-1</id>
+      <activation>
+        <property>
+          <name>!flume.hadoop.profile</name>
+        </property>
+      </activation>
+      <dependencies>
+        <dependency>
+          <groupId>org.apache.hadoop</groupId>
+          <artifactId>hadoop-common</artifactId>
+          <scope>test</scope>
+          <optional>true</optional>
+        </dependency>
+
+        <dependency>
+          <groupId>org.apache.hadoop</groupId>
+          <artifactId>hadoop-mapreduce-client-core</artifactId>
+          <scope>test</scope>
+          <version>${hadoop.version}</version>
+        </dependency>
+      </dependencies>
+    </profile>
+  </profiles>
+
+  <dependencies>
+
+    <dependency>
+      <groupId>org.apache.flume</groupId>
+      <artifactId>flume-ng-sdk</artifactId>
+    </dependency>
+
+    <dependency>
+      <groupId>org.apache.flume</groupId>
+      <artifactId>flume-ng-configuration</artifactId>
+    </dependency>
+
+    <dependency>
+      <groupId>org.apache.flume</groupId>
+      <artifactId>flume-ng-core</artifactId>
+    </dependency>
+
+    <dependency>
+      <groupId>org.slf4j</groupId>
+      <artifactId>slf4j-api</artifactId>
+    </dependency>
+
+    <dependency>
+      <groupId>junit</groupId>
+      <artifactId>junit</artifactId>
+      <scope>test</scope>
+    </dependency>
+
+    <dependency>
+      <groupId>org.slf4j</groupId>
+      <artifactId>slf4j-log4j12</artifactId>
+      <scope>test</scope>
+    </dependency>
+
+    <dependency>
+      <groupId>org.apache.hive.hcatalog</groupId>
+      <artifactId>hive-hcatalog-streaming</artifactId>
+      <scope>provided</scope>
+    </dependency>
+
+    <dependency>
+      <groupId>org.apache.hive.hcatalog</groupId>
+      <artifactId>hive-hcatalog-core</artifactId>
+      <scope>provided</scope>
+      <version>${hive.version}</version>
+    </dependency>
+
+    <dependency>
+      <groupId>org.apache.hive</groupId>
+      <artifactId>hive-cli</artifactId>
+      <scope>test</scope>
+    </dependency>
+
+    <!--temporary - really belongs to hive-streaming : roshan -->
+    <dependency>
+      <groupId>xerces</groupId>
+      <artifactId>xercesImpl</artifactId>
+      <scope>runtime</scope>
+      <version>2.9.1</version>
+    </dependency>
+
+    <dependency>
+      <groupId>xalan</groupId>
+      <artifactId>serializer</artifactId>
+    </dependency>
+
+    <dependency>
+      <groupId>xalan</groupId>
+      <artifactId>xalan</artifactId>
+    </dependency>
+    <!-- end temporary -->
+
+  </dependencies>
+
+</project>
diff --git a/code/flume-ng-sinks/flume-hive-sink/src/main/java/org/apache/flume/sink/hive/Config.java b/code/flume-ng-sinks/flume-hive-sink/src/main/java/org/apache/flume/sink/hive/Config.java
new file mode 100644
index 0000000..b2d2582
--- /dev/null
+++ b/code/flume-ng-sinks/flume-hive-sink/src/main/java/org/apache/flume/sink/hive/Config.java
@@ -0,0 +1,41 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flume.sink.hive;
+
+public class Config {
+  public static final String HIVE_METASTORE = "hive.metastore";
+  public static final String HIVE_DATABASE = "hive.database";
+  public static final String HIVE_TABLE = "hive.table";
+  public static final String HIVE_PARTITION = "hive.partition";
+  public static final String HIVE_TXNS_PER_BATCH_ASK = "hive.txnsPerBatchAsk";
+  public static final String BATCH_SIZE = "batchSize";
+  public static final String IDLE_TIMEOUT = "idleTimeout";
+  public static final String CALL_TIMEOUT = "callTimeout";
+  public static final String HEART_BEAT_INTERVAL = "heartBeatInterval";
+  public static final String MAX_OPEN_CONNECTIONS = "maxOpenConnections";
+  public static final String USE_LOCAL_TIME_STAMP = "useLocalTimeStamp";
+  public static final String TIME_ZONE = "timeZone";
+  public static final String ROUND_UNIT = "roundUnit";
+  public static final String ROUND = "round";
+  public static final String HOUR = "hour";
+  public static final String MINUTE = "minute";
+  public static final String SECOND = "second";
+  public static final String ROUND_VALUE = "roundValue";
+  public static final String SERIALIZER = "serializer";
+}
diff --git a/code/flume-ng-sinks/flume-hive-sink/src/main/java/org/apache/flume/sink/hive/HiveDelimitedTextSerializer.java b/code/flume-ng-sinks/flume-hive-sink/src/main/java/org/apache/flume/sink/hive/HiveDelimitedTextSerializer.java
new file mode 100644
index 0000000..59520e7
--- /dev/null
+++ b/code/flume-ng-sinks/flume-hive-sink/src/main/java/org/apache/flume/sink/hive/HiveDelimitedTextSerializer.java
@@ -0,0 +1,115 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flume.sink.hive;
+
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.hive.hcatalog.streaming.DelimitedInputWriter;
+import org.apache.hive.hcatalog.streaming.HiveEndPoint;
+import org.apache.hive.hcatalog.streaming.RecordWriter;
+import org.apache.hive.hcatalog.streaming.StreamingException;
+import org.apache.hive.hcatalog.streaming.TransactionBatch;
+
+import java.io.IOException;
+import java.util.Collection;
+
+/** Forwards the incoming event body to Hive unmodified.
+ * Sets up the delimiter and the field-to-column mapping.
+ */
+public class HiveDelimitedTextSerializer implements HiveEventSerializer  {
+  public static final String ALIAS = "DELIMITED";
+
+  public static final String defaultDelimiter = ",";
+  public static final String SERIALIZER_DELIMITER = "serializer.delimiter";
+  public static final String SERIALIZER_FIELDNAMES = "serializer.fieldnames";
+  public static final String SERIALIZER_SERDE_SEPARATOR = "serializer.serdeSeparator";
+
+  private String delimiter;
+  private String[] fieldToColMapping = null;
+  private Character serdeSeparator = null;
+
+  @Override
+  public void write(TransactionBatch txnBatch, Event e)
+          throws StreamingException, IOException, InterruptedException {
+    txnBatch.write(e.getBody());
+  }
+
+  @Override
+  public void write(TransactionBatch txnBatch, Collection<byte[]> events)
+          throws StreamingException, IOException, InterruptedException {
+    txnBatch.write(events);
+  }
+
+
+  @Override
+  public RecordWriter createRecordWriter(HiveEndPoint endPoint)
+      throws StreamingException, IOException, ClassNotFoundException {
+    if (serdeSeparator == null) {
+      return new DelimitedInputWriter(fieldToColMapping, delimiter, endPoint);
+    }
+    return new DelimitedInputWriter(fieldToColMapping, delimiter, endPoint, null, serdeSeparator);
+  }
+
+  @Override
+  public void configure(Context context) {
+    delimiter = parseDelimiterSpec(
+            context.getString(SERIALIZER_DELIMITER, defaultDelimiter) );
+    String fieldNames = context.getString(SERIALIZER_FIELDNAMES);
+    if (fieldNames == null) {
+      throw new IllegalArgumentException("serializer.fieldnames is not specified " +
+              "for serializer " + this.getClass().getName() );
+    }
+    String serdeSeparatorStr = context.getString(SERIALIZER_SERDE_SEPARATOR);
+    this.serdeSeparator = parseSerdeSeparatorSpec(serdeSeparatorStr);
+
+    // split, but preserve empty fields (-1)
+    fieldToColMapping = fieldNames.trim().split(",",-1);
+  }
+
+  // If the delimiter is double-quoted, like "\t", drop the quotes.
+  private static String parseDelimiterSpec(String delimiter) {
+    if (delimiter == null) {
+      return null;
+    }
+    if (delimiter.length() >= 2 && delimiter.charAt(0) == '"' &&
+        delimiter.charAt(delimiter.length() - 1) == '"') {
+      return delimiter.substring(1, delimiter.length() - 1);
+    }
+    return delimiter;
+  }
+
+  // If the separator is a single-quoted character such as ',', drop the quotes.
+  private static Character parseSerdeSeparatorSpec(String separatorStr) {
+    if (separatorStr == null) {
+      return null;
+    }
+    if (separatorStr.length() == 1) {
+      return separatorStr.charAt(0);
+    }
+    if (separatorStr.length() == 3 &&
+        separatorStr.charAt(0) == '\'' &&
+        separatorStr.charAt(separatorStr.length() - 1) == '\'') {
+      return separatorStr.charAt(1);
+    }
+
+    throw new IllegalArgumentException("serializer.serdeSeparator spec is invalid " +
+            "for " + ALIAS + " serializer " );
+  }
+
+}
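Review note on the quoting rules in HiveDelimitedTextSerializer: the two private parsers accept a raw value or a quoted one (double quotes for `serializer.delimiter`, single quotes for `serializer.serdeSeparator`). The standalone sketch below shows the behaviour those comments describe, assuming both the opening and the closing quote are meant to be checked and that an empty delimiter should pass through unchanged. The class and method names are hypothetical and are not part of this patch.

```java
// Sketch only: DelimiterSpecSketch and its method names are hypothetical
// and do not exist in the Flume Hive sink.
final class DelimiterSpecSketch {

  // Drops one pair of surrounding double quotes; anything else is returned as-is.
  static String stripDoubleQuotes(String spec) {
    if (spec == null) {
      return null;
    }
    if (spec.length() >= 2 && spec.charAt(0) == '"'
        && spec.charAt(spec.length() - 1) == '"') {
      return spec.substring(1, spec.length() - 1);
    }
    return spec;
  }

  // Accepts a bare character (x) or a single-quoted character ('x').
  static Character parseQuotedChar(String spec) {
    if (spec == null) {
      return null;
    }
    if (spec.length() == 1) {
      return spec.charAt(0);
    }
    if (spec.length() == 3 && spec.charAt(0) == '\'' && spec.charAt(2) == '\'') {
      return spec.charAt(1);
    }
    throw new IllegalArgumentException("invalid separator spec: " + spec);
  }

  public static void main(String[] args) {
    System.out.println(stripDoubleQuotes("\"|\""));  // quoted delimiter
    System.out.println(stripDoubleQuotes(","));      // unquoted delimiter
    System.out.println(parseQuotedChar("'|'"));      // quoted separator
    System.out.println(parseQuotedChar(";"));        // bare separator
  }
}
```

A three-character value whose first character is not a quote, such as `a|'`, is rejected here with an IllegalArgumentException rather than silently yielding `|`.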
diff --git a/code/flume-ng-sinks/flume-hive-sink/src/main/java/org/apache/flume/sink/hive/HiveEventSerializer.java b/code/flume-ng-sinks/flume-hive-sink/src/main/java/org/apache/flume/sink/hive/HiveEventSerializer.java
new file mode 100644
index 0000000..7ed2c82
--- /dev/null
+++ b/code/flume-ng-sinks/flume-hive-sink/src/main/java/org/apache/flume/sink/hive/HiveEventSerializer.java
@@ -0,0 +1,41 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flume.sink.hive;
+
+import org.apache.flume.Event;
+import org.apache.flume.conf.Configurable;
+import org.apache.hive.hcatalog.streaming.HiveEndPoint;
+import org.apache.hive.hcatalog.streaming.RecordWriter;
+import org.apache.hive.hcatalog.streaming.StreamingException;
+import org.apache.hive.hcatalog.streaming.TransactionBatch;
+
+import java.io.IOException;
+import java.util.Collection;
+
+public interface HiveEventSerializer extends Configurable {
+  public void write(TransactionBatch batch, Event e)
+          throws StreamingException, IOException, InterruptedException;
+
+  public void write(TransactionBatch txnBatch, Collection<byte[]> events)
+          throws StreamingException, IOException, InterruptedException;
+
+  RecordWriter createRecordWriter(HiveEndPoint endPoint)
+          throws StreamingException, IOException, ClassNotFoundException;
+
+}
diff --git a/code/flume-ng-sinks/flume-hive-sink/src/main/java/org/apache/flume/sink/hive/HiveJsonSerializer.java b/code/flume-ng-sinks/flume-hive-sink/src/main/java/org/apache/flume/sink/hive/HiveJsonSerializer.java
new file mode 100644
index 0000000..0311a5b
--- /dev/null
+++ b/code/flume-ng-sinks/flume-hive-sink/src/main/java/org/apache/flume/sink/hive/HiveJsonSerializer.java
@@ -0,0 +1,62 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flume.sink.hive;
+
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.hive.hcatalog.streaming.HiveEndPoint;
+import org.apache.hive.hcatalog.streaming.RecordWriter;
+import org.apache.hive.hcatalog.streaming.StreamingException;
+import org.apache.hive.hcatalog.streaming.StrictJsonWriter;
+import org.apache.hive.hcatalog.streaming.TransactionBatch;
+
+import java.io.IOException;
+import java.util.Collection;
+
+/** Forwards the incoming event body to Hive unmodified.
+ * The body is expected to be JSON and is written through a StrictJsonWriter.
+ */
+
+public class HiveJsonSerializer implements HiveEventSerializer  {
+  public static final String ALIAS = "JSON";
+
+  @Override
+  public void write(TransactionBatch txnBatch, Event e)
+          throws StreamingException, IOException, InterruptedException {
+    txnBatch.write(e.getBody());
+  }
+
+  @Override
+  public void write(TransactionBatch txnBatch, Collection<byte[]> events)
+          throws StreamingException, IOException, InterruptedException {
+    txnBatch.write(events);
+  }
+
+  @Override
+  public RecordWriter createRecordWriter(HiveEndPoint endPoint)
+          throws StreamingException, IOException, ClassNotFoundException {
+    return new StrictJsonWriter(endPoint);
+  }
+
+  @Override
+  public void configure(Context context) {
+    // Nothing to configure for this serializer.
+  }
+
+}
diff --git a/code/flume-ng-sinks/flume-hive-sink/src/main/java/org/apache/flume/sink/hive/HiveSink.java b/code/flume-ng-sinks/flume-hive-sink/src/main/java/org/apache/flume/sink/hive/HiveSink.java
new file mode 100644
index 0000000..cc5cdca
--- /dev/null
+++ b/code/flume-ng-sinks/flume-hive-sink/src/main/java/org/apache/flume/sink/hive/HiveSink.java
@@ -0,0 +1,522 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flume.sink.hive;
+
+import com.google.common.annotations.VisibleForTesting;
+import com.google.common.base.Preconditions;
+import com.google.common.collect.Lists;
+import com.google.common.collect.Maps;
+import com.google.common.util.concurrent.ThreadFactoryBuilder;
+import org.apache.flume.Channel;
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.flume.EventDeliveryException;
+import org.apache.flume.Transaction;
+import org.apache.flume.conf.Configurable;
+import org.apache.flume.formatter.output.BucketPath;
+import org.apache.flume.instrumentation.SinkCounter;
+import org.apache.flume.sink.AbstractSink;
+import org.apache.hive.hcatalog.streaming.HiveEndPoint;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Calendar;
+import java.util.List;
+import java.util.Map;
+import java.util.Map.Entry;
+import java.util.TimeZone;
+import java.util.Timer;
+import java.util.TimerTask;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.atomic.AtomicBoolean;
+
+public class HiveSink extends AbstractSink implements Configurable {
+
+  private static final Logger LOG = LoggerFactory.getLogger(HiveSink.class);
+
+  private static final int DEFAULT_MAXOPENCONNECTIONS = 500;
+  private static final int DEFAULT_TXNSPERBATCH = 100;
+  private static final int DEFAULT_BATCHSIZE = 15000;
+  private static final int DEFAULT_CALLTIMEOUT = 10000;
+  private static final int DEFAULT_IDLETIMEOUT = 0;
+  private static final int DEFAULT_HEARTBEATINTERVAL = 240; // seconds
+
+  private Map<HiveEndPoint, HiveWriter> allWriters;
+
+  private SinkCounter sinkCounter;
+  private volatile int idleTimeout;
+  private String metaStoreUri;
+  private String proxyUser;
+  private String database;
+  private String table;
+  private List<String> partitionVals;
+  private Integer txnsPerBatchAsk;
+  private Integer batchSize;
+  private Integer maxOpenConnections;
+  private boolean autoCreatePartitions;
+  private String serializerType;
+  private HiveEventSerializer serializer;
+
+  /**
+   * Default timeout for blocking I/O calls in HiveWriter
+   */
+  private Integer callTimeout;
+  private Integer heartBeatInterval;
+
+  private ExecutorService callTimeoutPool;
+
+  private boolean useLocalTime;
+  private TimeZone timeZone;
+  private boolean needRounding;
+  private int roundUnit;
+  private Integer roundValue;
+
+  private Timer heartBeatTimer = new Timer();
+  private AtomicBoolean timeToSendHeartBeat = new AtomicBoolean(false);
+
+  @VisibleForTesting
+  Map<HiveEndPoint, HiveWriter> getAllWriters() {
+    return allWriters;
+  }
+
+  // Read the configuration and set up thresholds.
+  @Override
+  public void configure(Context context) {
+
+    metaStoreUri = context.getString(Config.HIVE_METASTORE);
+    if (metaStoreUri == null) {
+      throw new IllegalArgumentException(Config.HIVE_METASTORE + " config setting is not " +
+              "specified for sink " + getName());
+    }
+    if (metaStoreUri.equalsIgnoreCase("null")) { // for testing support
+      metaStoreUri = null;
+    }
+    proxyUser = null; // context.getString("hive.proxyUser"); not supported by hive api yet
+    database = context.getString(Config.HIVE_DATABASE);
+    if (database == null) {
+      throw new IllegalArgumentException(Config.HIVE_DATABASE + " config setting is not " +
+            "specified for sink " + getName());
+    }
+    table = context.getString(Config.HIVE_TABLE);
+    if (table == null) {
+      throw new IllegalArgumentException(Config.HIVE_TABLE + " config setting is not " +
+              "specified for sink " + getName());
+    }
+
+    String partitions = context.getString(Config.HIVE_PARTITION);
+    if (partitions != null) {
+      partitionVals = Arrays.asList(partitions.split(","));
+    }
+
+
+    txnsPerBatchAsk = context.getInteger(Config.HIVE_TXNS_PER_BATCH_ASK, DEFAULT_TXNSPERBATCH);
+    if (txnsPerBatchAsk < 0) {
+      LOG.warn(getName() + ". hive.txnsPerBatchAsk must not be negative. Defaulting to "
+              + DEFAULT_TXNSPERBATCH);
+      txnsPerBatchAsk = DEFAULT_TXNSPERBATCH;
+    }
+    batchSize = context.getInteger(Config.BATCH_SIZE, DEFAULT_BATCHSIZE);
+    if (batchSize < 0) {
+      LOG.warn(getName() + ". batchSize must not be negative. Defaulting to "
+              + DEFAULT_BATCHSIZE);
+      batchSize = DEFAULT_BATCHSIZE;
+    }
+    idleTimeout = context.getInteger(Config.IDLE_TIMEOUT, DEFAULT_IDLETIMEOUT);
+    if (idleTimeout < 0) {
+      LOG.warn(getName() + ". idleTimeout must not be negative. Defaulting to "
+              + DEFAULT_IDLETIMEOUT);
+      idleTimeout = DEFAULT_IDLETIMEOUT;
+    }
+    callTimeout = context.getInteger(Config.CALL_TIMEOUT, DEFAULT_CALLTIMEOUT);
+    if (callTimeout < 0) {
+      LOG.warn(getName() + ". callTimeout must not be negative. Defaulting to "
+              + DEFAULT_CALLTIMEOUT);
+      callTimeout = DEFAULT_CALLTIMEOUT;
+    }
+
+    heartBeatInterval = context.getInteger(Config.HEART_BEAT_INTERVAL, DEFAULT_HEARTBEATINTERVAL);
+    if (heartBeatInterval < 0) {
+      LOG.warn(getName() + ". heartBeatInterval must not be negative. Defaulting to "
+              + DEFAULT_HEARTBEATINTERVAL);
+      heartBeatInterval = DEFAULT_HEARTBEATINTERVAL;
+    }
+    maxOpenConnections = context.getInteger(Config.MAX_OPEN_CONNECTIONS,
+                                            DEFAULT_MAXOPENCONNECTIONS);
+    autoCreatePartitions =  context.getBoolean("autoCreatePartitions", true);
+
+    // Timestamp processing
+    useLocalTime = context.getBoolean(Config.USE_LOCAL_TIME_STAMP, false);
+
+    String tzName = context.getString(Config.TIME_ZONE);
+    timeZone = (tzName == null) ? null : TimeZone.getTimeZone(tzName);
+    needRounding = context.getBoolean(Config.ROUND, false);
+
+    String unit = context.getString(Config.ROUND_UNIT, Config.MINUTE);
+    if (unit.equalsIgnoreCase(Config.HOUR)) {
+      this.roundUnit = Calendar.HOUR_OF_DAY;
+    } else if (unit.equalsIgnoreCase(Config.MINUTE)) {
+      this.roundUnit = Calendar.MINUTE;
+    } else if (unit.equalsIgnoreCase(Config.SECOND)) {
+      this.roundUnit = Calendar.SECOND;
+    } else {
+      LOG.warn(getName() + ". Rounding unit is not valid, please set one of " +
+              "minute, hour or second. Rounding will be disabled");
+      needRounding = false;
+    }
+    this.roundValue = context.getInteger(Config.ROUND_VALUE, 1);
+    if (roundUnit == Calendar.SECOND || roundUnit == Calendar.MINUTE) {
+      Preconditions.checkArgument(roundValue > 0 && roundValue <= 60,
+              "Round value must be > 0 and <= 60");
+    } else if (roundUnit == Calendar.HOUR_OF_DAY) {
+      Preconditions.checkArgument(roundValue > 0 && roundValue <= 24,
+              "Round value must be > 0 and <= 24");
+    }
+
+    // Serializer
+    serializerType = context.getString(Config.SERIALIZER, "");
+    if (serializerType.isEmpty()) {
+      throw new IllegalArgumentException("serializer config setting is not " +
+              "specified for sink " + getName());
+    }
+
+    serializer = createSerializer(serializerType);
+    serializer.configure(context);
+
+    Preconditions.checkArgument(batchSize > 0, "batchSize must be greater than 0");
+
+    if (sinkCounter == null) {
+      sinkCounter = new SinkCounter(getName());
+    }
+  }
+
+  @VisibleForTesting
+  protected SinkCounter getCounter() {
+    return sinkCounter;
+  }
+  private HiveEventSerializer createSerializer(String serializerName)  {
+    if (serializerName.equalsIgnoreCase(HiveDelimitedTextSerializer.ALIAS) ||
+        serializerName.equals(HiveDelimitedTextSerializer.class.getName())) {
+      return new HiveDelimitedTextSerializer();
+    } else if (serializerName.equalsIgnoreCase(HiveJsonSerializer.ALIAS) ||
+        serializerName.equals(HiveJsonSerializer.class.getName())) {
+      return new HiveJsonSerializer();
+    }
+
+    try {
+      return (HiveEventSerializer) Class.forName(serializerName).newInstance();
+    } catch (Exception e) {
+      throw new IllegalArgumentException("Unable to instantiate serializer: " + serializerName
+              + " on sink: " + getName(), e);
+    }
+  }
+
+
+  /**
+   * Pull events out of channel, find corresponding HiveWriter and write to it.
+   * Take at most batchSize events per Transaction. <br/>
+   * This method is not thread safe.
+   */
+  public Status process() throws EventDeliveryException {
+    // writers used in this Txn
+
+    Channel channel = getChannel();
+    Transaction transaction = channel.getTransaction();
+    transaction.begin();
+    boolean success = false;
+    try {
+      // 1 Enable Heart Beats
+      if (timeToSendHeartBeat.compareAndSet(true, false)) {
+        enableHeartBeatOnAllWriters();
+      }
+
+      // 2 Drain Batch
+      int txnEventCount = drainOneBatch(channel);
+      transaction.commit();
+      success = true;
+
+      // 3 Back off if the channel was empty, otherwise stay ready
+      if (txnEventCount < 1) {
+        return Status.BACKOFF;
+      } else {
+        return Status.READY;
+      }
+    } catch (InterruptedException err) {
+      LOG.warn(getName() + ": Thread was interrupted.", err);
+      return Status.BACKOFF;
+    } catch (Exception e) {
+      throw new EventDeliveryException(e);
+    } finally {
+      if (!success) {
+        transaction.rollback();
+      }
+      transaction.close();
+    }
+  }
+
+  // Drains one batch of events from Channel into Hive
+  private int drainOneBatch(Channel channel)
+          throws HiveWriter.Failure, InterruptedException {
+    int txnEventCount = 0;
+    try {
+      Map<HiveEndPoint,HiveWriter> activeWriters = Maps.newHashMap();
+      for (; txnEventCount < batchSize; ++txnEventCount) {
+        // 0) Read event from Channel
+        Event event = channel.take();
+        if (event == null) {
+          break;
+        }
+
+        //1) Create end point by substituting place holders
+        HiveEndPoint endPoint = makeEndPoint(metaStoreUri, database, table,
+                partitionVals, event.getHeaders(), timeZone,
+                needRounding, roundUnit, roundValue, useLocalTime);
+
+        //2) Create or reuse Writer
+        HiveWriter writer = getOrCreateWriter(activeWriters, endPoint);
+
+        //3) Write
+        LOG.debug("{} : Writing event to {}", getName(), endPoint);
+        writer.write(event);
+
+      } // for
+
+      //4) Update counters
+      if (txnEventCount == 0) {
+        sinkCounter.incrementBatchEmptyCount();
+      } else if (txnEventCount == batchSize) {
+        sinkCounter.incrementBatchCompleteCount();
+      } else {
+        sinkCounter.incrementBatchUnderflowCount();
+      }
+      sinkCounter.addToEventDrainAttemptCount(txnEventCount);
+
+
+      // 5) Flush all Writers
+      for (HiveWriter writer : activeWriters.values()) {
+        writer.flush(true);
+      }
+
+      sinkCounter.addToEventDrainSuccessCount(txnEventCount);
+      return txnEventCount;
+    } catch (HiveWriter.Failure e) {
+      // in case of error we close all TxnBatches to start clean next time
+      LOG.warn(getName() + " : " + e.getMessage(), e);
+      abortAllWriters();
+      closeAllWriters();
+      throw e;
+    }
+  }
+
+  private void enableHeartBeatOnAllWriters() {
+    for (HiveWriter writer : allWriters.values()) {
+      writer.setHearbeatNeeded();
+    }
+  }
+
+  private HiveWriter getOrCreateWriter(Map<HiveEndPoint, HiveWriter> activeWriters,
+                                       HiveEndPoint endPoint)
+          throws HiveWriter.ConnectException, InterruptedException {
+    try {
+      HiveWriter writer = allWriters.get( endPoint );
+      if (writer == null) {
+        LOG.info(getName() + ": Creating Writer to Hive end point : " + endPoint);
+        writer = new HiveWriter(endPoint, txnsPerBatchAsk, autoCreatePartitions,
+                callTimeout, callTimeoutPool, proxyUser, serializer, sinkCounter);
+
+        sinkCounter.incrementConnectionCreatedCount();
+        if (allWriters.size() > maxOpenConnections) {
+          int retired = closeIdleWriters();
+          if (retired == 0) {
+            closeEldestWriter();
+          }
+        }
+        allWriters.put(endPoint, writer);
+        activeWriters.put(endPoint, writer);
+      } else {
+        if (activeWriters.get(endPoint) == null)  {
+          activeWriters.put(endPoint,writer);
+        }
+      }
+      return writer;
+    } catch (HiveWriter.ConnectException e) {
+      sinkCounter.incrementConnectionFailedCount();
+      throw e;
+    }
+
+  }
+
+  private HiveEndPoint makeEndPoint(String metaStoreUri, String database, String table,
+                                    List<String> partVals, Map<String, String> headers,
+                                    TimeZone timeZone, boolean needRounding,
+                                    int roundUnit, Integer roundValue,
+                                    boolean useLocalTime)  {
+    if (partVals == null) {
+      return new HiveEndPoint(metaStoreUri, database, table, null);
+    }
+
+    ArrayList<String> realPartVals = Lists.newArrayList();
+    for (String partVal : partVals) {
+      realPartVals.add(BucketPath.escapeString(partVal, headers, timeZone,
+              needRounding, roundUnit, roundValue, useLocalTime));
+    }
+    return new HiveEndPoint(metaStoreUri, database, table, realPartVals);
+  }
+
+  /**
+   * Locate writer that has not been used for longest time and retire it
+   */
+  private void closeEldestWriter() throws InterruptedException {
+    long oldestTimeStamp = System.currentTimeMillis();
+    HiveEndPoint eldest = null;
+    for (Entry<HiveEndPoint,HiveWriter> entry : allWriters.entrySet()) {
+      if (entry.getValue().getLastUsed() < oldestTimeStamp) {
+        eldest = entry.getKey();
+        oldestTimeStamp = entry.getValue().getLastUsed();
+      }
+    }
+
+    try {
+      // the closed-connection count is updated inside HiveWriter.closeConnection()
+      LOG.info(getName() + ": Closing least recently used Writer to Hive EndPoint : " + eldest);
+      allWriters.remove(eldest).close();
+    } catch (InterruptedException e) {
+      LOG.warn(getName() + ": Interrupted when attempting to close writer for end point: "
+              + eldest, e);
+      throw e;
+    }
+  }
+
+  /**
+   * Locate all writers past idle timeout and retire them
+   * @return number of writers retired
+   */
+  private int closeIdleWriters() throws InterruptedException {
+    int count = 0;
+    long now = System.currentTimeMillis();
+    ArrayList<HiveEndPoint> retirees = Lists.newArrayList();
+
+    //1) Find retirement candidates
+    for (Entry<HiveEndPoint,HiveWriter> entry : allWriters.entrySet()) {
+      if (now - entry.getValue().getLastUsed() > idleTimeout) {
+        ++count;
+        retirees.add(entry.getKey());
+      }
+    }
+    //2) Retire them
+    for (HiveEndPoint ep : retirees) {
+      // HiveWriter.close() increments the closed-connection count itself
+      LOG.info(getName() + ": Closing idle Writer to Hive end point : {}", ep);
+      allWriters.remove(ep).close();
+    }
+    return count;
+  }
+
+  /**
+   * Closes all writers and removes them from the cache
+   * @throws InterruptedException if interrupted while closing a writer
+   */
+  private void closeAllWriters() throws InterruptedException {
+    //1) Retire writers
+    for (Entry<HiveEndPoint,HiveWriter> entry : allWriters.entrySet()) {
+      entry.getValue().close();
+    }
+
+    //2) Clear cache
+    allWriters.clear();
+  }
+
+  /**
+   * Aborts the current Txn on all writers
+   * @throws InterruptedException if interrupted while aborting a Txn
+   */
+  private void abortAllWriters() throws InterruptedException {
+    for (Entry<HiveEndPoint,HiveWriter> entry : allWriters.entrySet()) {
+      entry.getValue().abort();
+    }
+  }
+
+  @Override
+  public void stop() {
+    // do not constrain close() calls with a timeout
+    for (Entry<HiveEndPoint, HiveWriter> entry : allWriters.entrySet()) {
+      try {
+        HiveWriter w = entry.getValue();
+        w.close();
+      } catch (InterruptedException ex) {
+        Thread.currentThread().interrupt();
+      }
+    }
+
+    // shut down all thread pools
+    callTimeoutPool.shutdown();
+    try {
+      while (!callTimeoutPool.isTerminated()) {
+        callTimeoutPool.awaitTermination(
+              Math.max(DEFAULT_CALLTIMEOUT, callTimeout), TimeUnit.MILLISECONDS);
+      }
+    } catch (InterruptedException ex) {
+      LOG.warn(getName() + ":Shutdown interrupted on " + callTimeoutPool, ex);
+    }
+
+    callTimeoutPool = null;
+    allWriters.clear();
+    allWriters = null;
+    sinkCounter.stop();
+    super.stop();
+    LOG.info("Hive Sink {} stopped", getName() );
+  }
+
+  @Override
+  public void start() {
+    String timeoutName = "hive-" + getName() + "-call-runner-%d";
+    // call timeout pool needs only 1 thread as the sink is effectively single threaded
+    callTimeoutPool = Executors.newFixedThreadPool(1,
+            new ThreadFactoryBuilder().setNameFormat(timeoutName).build());
+
+    this.allWriters = Maps.newHashMap();
+    sinkCounter.start();
+    super.start();
+    setupHeartBeatTimer();
+    LOG.info("Hive Sink {} started", getName());
+  }
+
+  private void setupHeartBeatTimer() {
+    if (heartBeatInterval > 0) {
+      heartBeatTimer.schedule(new TimerTask() {
+        @Override
+        public void run() {
+          timeToSendHeartBeat.set(true);
+          setupHeartBeatTimer();
+        }
+      }, heartBeatInterval * 1000);
+    }
+  }
+
+
+  @Override
+  public String toString() {
+    return "{ Sink type:" + getClass().getSimpleName() + ", name:" + getName() +
+            " }";
+  }
+
+}
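Review note on the writer-cache eviction in `getOrCreateWriter`, `closeIdleWriters` and `closeEldestWriter` above: the policy is "retire everything past the idle timeout first, and only if nothing was idle retire the least recently used writer". The sketch below isolates that policy so it can be run without Hive. `WriterCacheSketch`, its string keys and the explicit `nowMs` argument are hypothetical stand-ins for `HiveEndPoint`, `HiveWriter` and `System.currentTimeMillis()`. It is not the sink's implementation, and unlike the patch it starts the eldest search from `Long.MAX_VALUE` so an empty cache yields `null` rather than a failed removal.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the writer cache: string keys and timestamps
// replace HiveEndPoint and HiveWriter so the policy can run without Hive.
class WriterCacheSketch {
  private final Map<String, Long> lastUsedByKey = new HashMap<>();
  private final long idleTimeoutMs;

  WriterCacheSketch(long idleTimeoutMs) {
    this.idleTimeoutMs = idleTimeoutMs;
  }

  void touch(String key, long nowMs) {
    lastUsedByKey.put(key, nowMs);
  }

  boolean contains(String key) {
    return lastUsedByKey.containsKey(key);
  }

  int size() {
    return lastUsedByKey.size();
  }

  // Removes every entry idle for longer than the timeout; returns the count.
  int evictIdle(long nowMs) {
    List<String> retirees = new ArrayList<>();
    // Collect first, remove afterwards, to avoid mutating the map mid-iteration.
    for (Map.Entry<String, Long> entry : lastUsedByKey.entrySet()) {
      if (nowMs - entry.getValue() > idleTimeoutMs) {
        retirees.add(entry.getKey());
      }
    }
    for (String key : retirees) {
      lastUsedByKey.remove(key);
    }
    return retirees.size();
  }

  // Removes the least recently used entry; returns its key, or null if empty.
  String evictEldest() {
    String eldest = null;
    long oldest = Long.MAX_VALUE;
    for (Map.Entry<String, Long> entry : lastUsedByKey.entrySet()) {
      if (entry.getValue() < oldest) {
        eldest = entry.getKey();
        oldest = entry.getValue();
      }
    }
    if (eldest != null) {
      lastUsedByKey.remove(eldest);
    }
    return eldest;
  }

  // Idle entries go first; the eldest is retired only if nothing was idle.
  void makeRoom(long nowMs) {
    if (evictIdle(nowMs) == 0) {
      evictEldest();
    }
  }

  public static void main(String[] args) {
    WriterCacheSketch cache = new WriterCacheSketch(100);
    cache.touch("a", 0);
    cache.touch("b", 50);
    cache.touch("c", 90);
    cache.makeRoom(120);
    System.out.println("entries left after makeRoom: " + cache.size());
  }
}
```

Collecting the retirees before removing them matches the two-pass structure of `closeIdleWriters` and avoids a `ConcurrentModificationException` on the `HashMap`.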
diff --git a/code/flume-ng-sinks/flume-hive-sink/src/main/java/org/apache/flume/sink/hive/HiveWriter.java b/code/flume-ng-sinks/flume-hive-sink/src/main/java/org/apache/flume/sink/hive/HiveWriter.java
new file mode 100644
index 0000000..7106696
--- /dev/null
+++ b/code/flume-ng-sinks/flume-hive-sink/src/main/java/org/apache/flume/sink/hive/HiveWriter.java
@@ -0,0 +1,513 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * <p>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flume.sink.hive;
+
+import org.apache.flume.Event;
+import org.apache.flume.instrumentation.SinkCounter;
+import org.apache.hive.hcatalog.streaming.HiveEndPoint;
+import org.apache.hive.hcatalog.streaming.RecordWriter;
+import org.apache.hive.hcatalog.streaming.SerializationError;
+import org.apache.hive.hcatalog.streaming.StreamingConnection;
+import org.apache.hive.hcatalog.streaming.StreamingException;
+import org.apache.hive.hcatalog.streaming.StreamingIOFailure;
+import org.apache.hive.hcatalog.streaming.TransactionBatch;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.concurrent.Callable;
+import java.util.concurrent.ExecutionException;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Future;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.TimeoutException;
+
+/**
+ * Internal API intended for HiveSink use.
+ */
+class HiveWriter {
+
+  private static final Logger LOG = LoggerFactory.getLogger(HiveWriter.class);
+
+  private final HiveEndPoint endPoint;
+  private HiveEventSerializer serializer;
+  private final StreamingConnection connection;
+  private final int txnsPerBatch;
+  private final RecordWriter recordWriter;
+  private TransactionBatch txnBatch;
+
+  private final ExecutorService callTimeoutPool;
+
+  private final long callTimeout;
+
+  private long lastUsed; // time of last flush on this writer
+
+  private SinkCounter sinkCounter;
+  private int batchCounter;
+  private long eventCounter;
+  private long processSize;
+
+  protected boolean closed; // flag indicating HiveWriter was closed
+  private boolean autoCreatePartitions;
+
+  private boolean hearbeatNeeded = false;
+
+  private final int writeBatchSz = 1000;
+  private ArrayList<Event> batch = new ArrayList<Event>(writeBatchSz);
+
+  HiveWriter(HiveEndPoint endPoint, int txnsPerBatch,
+             boolean autoCreatePartitions, long callTimeout,
+             ExecutorService callTimeoutPool, String hiveUser,
+             HiveEventSerializer serializer, SinkCounter sinkCounter)
+      throws ConnectException, InterruptedException {
+    try {
+      this.autoCreatePartitions = autoCreatePartitions;
+      this.sinkCounter = sinkCounter;
+      this.callTimeout = callTimeout;
+      this.callTimeoutPool = callTimeoutPool;
+      this.endPoint = endPoint;
+      this.connection = newConnection(hiveUser);
+      this.txnsPerBatch = txnsPerBatch;
+      this.serializer = serializer;
+      this.recordWriter = serializer.createRecordWriter(endPoint);
+      this.txnBatch = nextTxnBatch(recordWriter);
+      this.txnBatch.beginNextTransaction();
+      this.closed = false;
+      this.lastUsed = System.currentTimeMillis();
+    } catch (InterruptedException e) {
+      throw e;
+    } catch (RuntimeException e) {
+      throw e;
+    } catch (Exception e) {
+      throw new ConnectException(endPoint, e);
+    }
+  }
+
+  @Override
+  public String toString() {
+    return endPoint.toString();
+  }
+
+  /**
+   * Clear the class counters
+   */
+  private void resetCounters() {
+    eventCounter = 0;
+    processSize = 0;
+    batchCounter = 0;
+  }
+
+  void setHearbeatNeeded() {
+    hearbeatNeeded = true;
+  }
+
+  public int getRemainingTxns() {
+    return txnBatch.remainingTransactions();
+  }
+
+
+  /**
+   * Buffers the event, writes the buffer out once it is full, and updates stats
+   * @param event the event to write
+   * @throws WriteException on a streaming I/O error or timeout while writing
+   * @throws InterruptedException
+   */
+  public synchronized void write(final Event event)
+      throws WriteException, InterruptedException {
+    if (closed) {
+      throw new IllegalStateException("Writer closed. Cannot write to : " + endPoint);
+    }
+
+    batch.add(event);
+    if (batch.size() == writeBatchSz) {
+      // write the event
+      writeEventBatchToSerializer();
+    }
+
+    // Update Statistics
+    processSize += event.getBody().length;
+    eventCounter++;
+  }
+
+  private void writeEventBatchToSerializer()
+      throws InterruptedException, WriteException {
+    try {
+      timedCall(new CallRunner1<Void>() {
+        @Override
+        public Void call() throws InterruptedException, StreamingException {
+          try {
+            for (Event event : batch) {
+              try {
+                serializer.write(txnBatch, event);
+              } catch (SerializationError err) {
+                LOG.info("Parse failed : {}  : {}", err.getMessage(), new String(event.getBody()));
+              }
+            }
+            return null;
+          } catch (IOException e) {
+            throw new StreamingIOFailure(e.getMessage(), e);
+          }
+        }
+      });
+      batch.clear();
+    } catch (StreamingException e) {
+      throw new WriteException(endPoint, txnBatch.getCurrentTxnId(), e);
+    } catch (TimeoutException e) {
+      throw new WriteException(endPoint, txnBatch.getCurrentTxnId(), e);
+    }
+  }
+
+  /**
+   * Commits the current Txn.
+   * If 'rollToNext' is true, will switch to next Txn in batch or to a
+   *       new TxnBatch if current Txn batch is exhausted
+   */
+  public void flush(boolean rollToNext)
+      throws CommitException, TxnBatchException, TxnFailure, InterruptedException,
+      WriteException {
+    if (!batch.isEmpty()) {
+      writeEventBatchToSerializer();
+      batch.clear();
+    }
+
+    //0 Heart beat on TxnBatch
+    if (hearbeatNeeded) {
+      hearbeatNeeded = false;
+      heartBeat();
+    }
+    lastUsed = System.currentTimeMillis();
+
+    try {
+      //1 commit txn & close batch if needed
+      commitTxn();
+      if (txnBatch.remainingTransactions() == 0) {
+        closeTxnBatch();
+        txnBatch = null;
+        if (rollToNext) {
+          txnBatch = nextTxnBatch(recordWriter);
+        }
+      }
+
+      //2 roll to next Txn
+      if (rollToNext) {
+        LOG.debug("Switching to next Txn for {}", endPoint);
+        txnBatch.beginNextTransaction(); // does not block
+      }
+    } catch (StreamingException e) {
+      throw new TxnFailure(txnBatch, e);
+    }
+  }
+
+  /**
+   * Aborts the current Txn
+   * @throws InterruptedException
+   */
+  public void abort() throws InterruptedException {
+    batch.clear();
+    abortTxn();
+  }
+
+  /** Sends a heartbeat on the current and remaining txns via the call timeout
+   *  pool and waits for it to finish (bounded by callTimeout); errors are logged
+   */
+  public void heartBeat() throws InterruptedException {
+    // 1) schedule the heartbeat on one thread in pool
+    try {
+      timedCall(new CallRunner1<Void>() {
+        @Override
+        public Void call() throws StreamingException {
+          LOG.info("Sending heartbeat on batch " + txnBatch);
+          txnBatch.heartbeat();
+          return null;
+        }
+      });
+    } catch (InterruptedException e) {
+      throw e;
+    } catch (Exception e) {
+      LOG.warn("Unable to send heartbeat on Txn Batch " + txnBatch, e);
+      // Suppressing exceptions as we don't care for errors on heartbeats
+    }
+  }
+
+  /**
+   * Aborts remaining txns, then closes the Transaction Batch and connection.
+   * Events still buffered in the write batch are discarded.
+   * @throws InterruptedException
+   */
+  public void close() throws InterruptedException {
+    batch.clear();
+    abortRemainingTxns();
+    closeTxnBatch();
+    closeConnection();
+    closed = true;
+  }
+
+
+  private void abortRemainingTxns() throws InterruptedException {
+    try {
+      if (!isClosed(txnBatch.getCurrentTransactionState())) {
+        abortCurrTxnHelper();
+      }
+
+      // recursively abort remaining txns
+      if (txnBatch.remainingTransactions() > 0) {
+        timedCall(
+            new CallRunner1<Void>() {
+              @Override
+              public Void call() throws StreamingException, InterruptedException {
+                txnBatch.beginNextTransaction();
+                return null;
+              }
+            });
+        abortRemainingTxns();
+      }
+    } catch (StreamingException e) {
+      LOG.warn("Error when aborting remaining transactions in batch " + txnBatch, e);
+      return;
+    } catch (TimeoutException e) {
+      LOG.warn("Timed out when aborting remaining transactions in batch " + txnBatch, e);
+      return;
+    }
+  }
+
+  private void abortCurrTxnHelper() throws TimeoutException, InterruptedException {
+    try {
+      timedCall(
+          new CallRunner1<Void>() {
+            @Override
+            public Void call() throws StreamingException, InterruptedException {
+              txnBatch.abort();
+              LOG.info("Aborted txn " + txnBatch.getCurrentTxnId());
+              return null;
+            }
+          }
+      );
+    } catch (StreamingException e) {
+      LOG.warn("Unable to abort transaction " + txnBatch.getCurrentTxnId(), e);
+      // continue to attempt to abort other txns in the batch
+    }
+  }
+
+  private boolean isClosed(TransactionBatch.TxnState txnState) {
+    if (txnState == TransactionBatch.TxnState.COMMITTED) {
+      return true;
+    }
+    if (txnState == TransactionBatch.TxnState.ABORTED) {
+      return true;
+    }
+    return false;
+  }
+
+  public void closeConnection() throws InterruptedException {
+    LOG.info("Closing connection to EndPoint : {}", endPoint);
+    try {
+      timedCall(new CallRunner1<Void>() {
+        @Override
+        public Void call() {
+          connection.close(); // could block
+          return null;
+        }
+      });
+      sinkCounter.incrementConnectionClosedCount();
+    } catch (Exception e) {
+      LOG.warn("Error closing connection to EndPoint : " + endPoint, e);
+      // Suppressing exceptions as we don't care for errors on connection close
+    }
+  }
+
+  private void commitTxn() throws CommitException, InterruptedException {
+    if (LOG.isInfoEnabled()) {
+      LOG.info("Committing Txn " + txnBatch.getCurrentTxnId() + " on EndPoint: " + endPoint);
+    }
+    try {
+      timedCall(new CallRunner1<Void>() {
+        @Override
+        public Void call() throws StreamingException, InterruptedException {
+          txnBatch.commit(); // could block
+          return null;
+        }
+      });
+    } catch (Exception e) {
+      throw new CommitException(endPoint, txnBatch.getCurrentTxnId(), e);
+    }
+  }
+
+  private void abortTxn() throws InterruptedException {
+    LOG.info("Aborting Txn id {} on End Point {}", txnBatch.getCurrentTxnId(), endPoint);
+    try {
+      timedCall(new CallRunner1<Void>() {
+        @Override
+        public Void call() throws StreamingException, InterruptedException {
+          txnBatch.abort(); // could block
+          return null;
+        }
+      });
+    } catch (InterruptedException e) {
+      throw e;
+    } catch (TimeoutException e) {
+      LOG.warn("Timeout while aborting Txn " + txnBatch.getCurrentTxnId() +
+               " on EndPoint: " + endPoint, e);
+    } catch (Exception e) {
+      LOG.warn("Error aborting Txn " + txnBatch.getCurrentTxnId() + " on EndPoint: " + endPoint, e);
+      // Suppressing exceptions as we don't care for errors on abort
+    }
+  }
+
+  private StreamingConnection newConnection(final String proxyUser)
+      throws InterruptedException, ConnectException {
+    try {
+      return timedCall(new CallRunner1<StreamingConnection>() {
+        @Override
+        public StreamingConnection call() throws InterruptedException, StreamingException {
+          return endPoint.newConnection(autoCreatePartitions); // could block
+        }
+      });
+    } catch (Exception e) {
+      throw new ConnectException(endPoint, e);
+    }
+  }
+
+  private TransactionBatch nextTxnBatch(final RecordWriter recordWriter)
+      throws InterruptedException, TxnBatchException {
+    LOG.debug("Fetching new Txn Batch for {}", endPoint);
+    TransactionBatch batch = null;
+    try {
+      batch = timedCall(new CallRunner1<TransactionBatch>() {
+        @Override
+        public TransactionBatch call() throws InterruptedException, StreamingException {
+          return connection.fetchTransactionBatch(txnsPerBatch, recordWriter); // could block
+        }
+      });
+      LOG.info("Acquired Transaction batch {}", batch);
+    } catch (Exception e) {
+      throw new TxnBatchException(endPoint, e);
+    }
+    return batch;
+  }
+
+  private void closeTxnBatch() throws InterruptedException {
+    try {
+      LOG.info("Closing Txn Batch {}.", txnBatch);
+      timedCall(new CallRunner1<Void>() {
+        @Override
+        public Void call() throws InterruptedException, StreamingException {
+          txnBatch.close(); // could block
+          return null;
+        }
+      });
+    } catch (InterruptedException e) {
+      throw e;
+    } catch (Exception e) {
+      LOG.warn("Error closing Txn Batch " + txnBatch, e);
+      // Suppressing exceptions as we don't care for errors on batch close
+    }
+  }
+
+  private <T> T timedCall(final CallRunner1<T> callRunner)
+      throws TimeoutException, InterruptedException, StreamingException {
+    Future<T> future = callTimeoutPool.submit(new Callable<T>() {
+      @Override
+      public T call() throws StreamingException, InterruptedException, Failure {
+        return callRunner.call();
+      }
+    });
+
+    try {
+      if (callTimeout > 0) {
+        return future.get(callTimeout, TimeUnit.MILLISECONDS);
+      } else {
+        return future.get();
+      }
+    } catch (TimeoutException eT) {
+      future.cancel(true);
+      sinkCounter.incrementConnectionFailedCount();
+      throw eT;
+    } catch (ExecutionException e1) {
+      sinkCounter.incrementConnectionFailedCount();
+      Throwable cause = e1.getCause();
+      if (cause instanceof IOException) {
+        throw new StreamingException("I/O Failure", (IOException) cause);
+      } else if (cause instanceof StreamingException) {
+        throw (StreamingException) cause;
+      } else if (cause instanceof TimeoutException) {
+        throw new StreamingException("Operation Timed Out.", (TimeoutException) cause);
+      } else if (cause instanceof RuntimeException) {
+        throw (RuntimeException) cause;
+      } else if (cause instanceof InterruptedException) {
+        throw (InterruptedException) cause;
+      }
+      throw new StreamingException(e1.getMessage(), e1);
+    }
+  }
+
+  long getLastUsed() {
+    return lastUsed;
+  }
+
+  /**
+   * Simple interface whose <tt>call</tt> method may throw any exception.
+   * Currently unused: <tt>timedCall</tt> takes a {@link CallRunner1}, which
+   * it runs on a thread from the call timeout pool.
+   * @param <T>
+   */
+  private interface CallRunner<T> {
+    T call() throws Exception;
+  }
+
+  private interface CallRunner1<T> {
+    T call() throws StreamingException, InterruptedException, Failure;
+  }
+
+  public static class Failure extends Exception {
+    public Failure(String msg, Throwable cause) {
+      super(msg, cause);
+    }
+  }
+
+  public static class WriteException extends Failure {
+    public WriteException(HiveEndPoint endPoint, Long currentTxnId, Throwable cause) {
+      super("Failed writing to : " + endPoint + ". TxnID : " + currentTxnId, cause);
+    }
+  }
+
+  public static class CommitException extends Failure {
+    public CommitException(HiveEndPoint endPoint, Long txnID, Throwable cause) {
+      super("Commit of Txn " + txnID + " failed on EndPoint: " + endPoint, cause);
+    }
+  }
+
+  public static class ConnectException extends Failure {
+    public ConnectException(HiveEndPoint ep, Throwable cause) {
+      super("Failed connecting to EndPoint " + ep, cause);
+    }
+  }
+
+  public static class TxnBatchException extends Failure {
+    public TxnBatchException(HiveEndPoint ep, Throwable cause) {
+      super("Failed acquiring Transaction Batch from EndPoint: " + ep, cause);
+    }
+  }
+
+  public static class TxnFailure extends Failure {
+    public TxnFailure(TransactionBatch txnBatch, Throwable cause) {
+      super("Failed switching to next Txn in TxnBatch " + txnBatch, cause);
+    }
+  }
+}
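Review note on `timedCall` above: every potentially blocking Hive call is submitted to the call timeout pool and awaited with a bounded `Future.get`, and the task is cancelled if the wait times out. The sketch below shows that pattern in isolation using only `java.util.concurrent`. `TimedCallSketch` and its parameter names are hypothetical. It omits the patch's translation of `ExecutionException` causes into `StreamingException` and its counter updates, so treat it as an illustration of the pattern rather than the writer's code.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper illustrating the submit / bounded-wait / cancel pattern.
class TimedCallSketch {

  // Runs the task on the pool; a non-positive timeout means wait indefinitely.
  static <T> T timedCall(ExecutorService pool, long timeoutMs, Callable<T> task)
      throws TimeoutException, InterruptedException, ExecutionException {
    Future<T> future = pool.submit(task);
    try {
      if (timeoutMs > 0) {
        return future.get(timeoutMs, TimeUnit.MILLISECONDS);
      }
      return future.get();
    } catch (TimeoutException e) {
      // Interrupt the worker so a stuck call does not occupy the pool's thread.
      future.cancel(true);
      throw e;
    }
  }

  public static void main(String[] args) throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(1);
    try {
      System.out.println(timedCall(pool, 1000, () -> "ok"));
      try {
        timedCall(pool, 50, () -> {
          Thread.sleep(5000); // simulates a call that blocks too long
          return "late";
        });
      } catch (TimeoutException e) {
        System.out.println("timed out");
      }
    } finally {
      pool.shutdownNow();
    }
  }
}
```

With a single-thread pool, as in `HiveSink.start()`, cancelling the timed-out task matters: without the interrupt, the next submitted call would queue behind the stuck one.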
diff --git a/code/flume-ng-sinks/flume-hive-sink/src/test/java/org/apache/flume/sink/hive/TestHiveSink.java b/code/flume-ng-sinks/flume-hive-sink/src/test/java/org/apache/flume/sink/hive/TestHiveSink.java
new file mode 100644
index 0000000..c417404
--- /dev/null
+++ b/code/flume-ng-sinks/flume-hive-sink/src/test/java/org/apache/flume/sink/hive/TestHiveSink.java
@@ -0,0 +1,423 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+
+package org.apache.flume.sink.hive;
+
+import com.google.common.collect.Lists;
+import junit.framework.Assert;
+import org.apache.flume.Channel;
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.flume.EventDeliveryException;
+import org.apache.flume.Transaction;
+import org.apache.flume.channel.MemoryChannel;
+import org.apache.flume.conf.Configurables;
+import org.apache.flume.event.SimpleEvent;
+import org.apache.flume.instrumentation.SinkCounter;
+import org.apache.hadoop.hive.cli.CliSessionState;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.api.MetaException;
+import org.apache.hadoop.hive.metastore.txn.TxnDbUtil;
+import org.apache.hadoop.hive.ql.CommandNeedRetryException;
+import org.apache.hadoop.hive.ql.Driver;
+import org.apache.hadoop.hive.ql.metadata.HiveException;
+import org.apache.hadoop.hive.ql.session.SessionState;
+import org.junit.After;
+import org.junit.Before;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.rules.TemporaryFolder;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Calendar;
+import java.util.List;
+import java.util.UUID;
+
+public class TestHiveSink {
+  // 1)  partitioned table
+  static final String dbName = "testing";
+  static final String tblName = "alerts";
+
+  public static final String PART1_NAME = "continent";
+  public static final String PART2_NAME = "country";
+  public static final String[] partNames = { PART1_NAME, PART2_NAME };
+
+  private static final String COL1 = "id";
+  private static final String COL2 = "msg";
+  final String[] colNames = {COL1,COL2};
+  private String[] colTypes = { "int", "string" };
+
+  private static final String PART1_VALUE = "Asia";
+  private static final String PART2_VALUE = "India";
+  private final ArrayList<String> partitionVals;
+
+  // 2) un-partitioned table
+  static final String dbName2 = "testing2";
+  static final String tblName2 = "alerts2";
+  final String[] colNames2 = {COL1,COL2};
+  private String[] colTypes2 = { "int", "string" };
+
+  HiveSink sink = new HiveSink();
+
+  private final HiveConf conf;
+
+  private final Driver driver;
+
+  final String metaStoreURI;
+
+  @Rule
+  public TemporaryFolder dbFolder = new TemporaryFolder();
+
+  private static final Logger LOG = LoggerFactory.getLogger(TestHiveSink.class);
+
+  public TestHiveSink() throws Exception {
+    partitionVals = new ArrayList<String>(2);
+    partitionVals.add(PART1_VALUE);
+    partitionVals.add(PART2_VALUE);
+
+    metaStoreURI = "null";
+
+    conf = new HiveConf(this.getClass());
+    TestUtil.setConfValues(conf);
+
+    // 1) prepare hive
+    TxnDbUtil.cleanDb();
+    TxnDbUtil.prepDb();
+
+    // 2) Setup Hive client
+    SessionState.start(new CliSessionState(conf));
+    driver = new Driver(conf);
+
+  }
+
+
+  @Before
+  public void setUp() throws Exception {
+    TestUtil.dropDB(conf, dbName);
+
+    sink = new HiveSink();
+    sink.setName("HiveSink-" + UUID.randomUUID().toString());
+
+    String dbLocation = dbFolder.newFolder(dbName).getCanonicalPath() + ".db";
+    dbLocation = dbLocation.replaceAll("\\\\","/"); // for windows paths
+    TestUtil.createDbAndTable(driver, dbName, tblName, partitionVals, colNames,
+            colTypes, partNames, dbLocation);
+  }
+
+  @After
+  public void tearDown() throws MetaException, HiveException {
+    TestUtil.dropDB(conf, dbName);
+  }
+
+
+  @Test
+  public void testSingleWriterSimplePartitionedTable()
+          throws EventDeliveryException, IOException, CommandNeedRetryException {
+    int totalRecords = 4;
+    int batchSize = 2;
+    int batchCount = totalRecords / batchSize;
+
+    Context context = new Context();
+    context.put("hive.metastore", metaStoreURI);
+    context.put("hive.database",dbName);
+    context.put("hive.table",tblName);
+    context.put("hive.partition", PART1_VALUE + "," + PART2_VALUE);
+    context.put("autoCreatePartitions","false");
+    context.put("batchSize","" + batchSize);
+    context.put("serializer", HiveDelimitedTextSerializer.ALIAS);
+    context.put("serializer.fieldnames", COL1 + ",," + COL2 + ",");
+    context.put("heartBeatInterval", "0");
+
+    Channel channel = startSink(sink, context);
+
+    List<String> bodies = Lists.newArrayList();
+
+    // push the events in two batches
+    Transaction txn = channel.getTransaction();
+    txn.begin();
+    for (int j = 1; j <= totalRecords; j++) {
+      Event event = new SimpleEvent();
+      String body = j + ",blah,This is a log message,other stuff";
+      event.setBody(body.getBytes());
+      bodies.add(body);
+      channel.put(event);
+    }
+    // commit the channel transaction so the sink can drain the events
+    txn.commit();
+    txn.close();
+
+
+    checkRecordCountInTable(0, dbName, tblName);
+    for (int i = 0; i < batchCount ; i++) {
+      sink.process();
+    }
+    sink.stop();
+    checkRecordCountInTable(totalRecords, dbName, tblName);
+  }
+
+  @Test
+  public void testSingleWriterSimpleUnPartitionedTable()
+          throws Exception {
+    TestUtil.dropDB(conf, dbName2);
+    String dbLocation = dbFolder.newFolder(dbName2).getCanonicalPath() + ".db";
+    dbLocation = dbLocation.replaceAll("\\\\","/"); // for windows paths
+    TestUtil.createDbAndTable(driver, dbName2, tblName2, null, colNames2, colTypes2,
+                              null, dbLocation);
+
+    try {
+      int totalRecords = 4;
+      int batchSize = 2;
+      int batchCount = totalRecords / batchSize;
+
+      Context context = new Context();
+      context.put("hive.metastore", metaStoreURI);
+      context.put("hive.database", dbName2);
+      context.put("hive.table", tblName2);
+      context.put("autoCreatePartitions","false");
+      context.put("batchSize","" + batchSize);
+      context.put("serializer", HiveDelimitedTextSerializer.ALIAS);
+      context.put("serializer.fieldnames", COL1 + ",," + COL2 + ",");
+      context.put("heartBeatInterval", "0");
+
+      Channel channel = startSink(sink, context);
+
+      List<String> bodies = Lists.newArrayList();
+
+      // Push the events in two batches
+      Transaction txn = channel.getTransaction();
+      txn.begin();
+      for (int j = 1; j <= totalRecords; j++) {
+        Event event = new SimpleEvent();
+        String body = j + ",blah,This is a log message,other stuff";
+        event.setBody(body.getBytes());
+        bodies.add(body);
+        channel.put(event);
+      }
+
+      txn.commit();
+      txn.close();
+
+      checkRecordCountInTable(0, dbName2, tblName2);
+      for (int i = 0; i < batchCount ; i++) {
+        sink.process();
+      }
+
+      // check before & after  stopping sink
+      checkRecordCountInTable(totalRecords, dbName2, tblName2);
+      sink.stop();
+      checkRecordCountInTable(totalRecords, dbName2, tblName2);
+    } finally {
+      TestUtil.dropDB(conf, dbName2);
+    }
+  }
+
+  @Test
+  public void testSingleWriterUseHeaders()
+          throws Exception {
+    String[] colNames = {COL1, COL2};
+    String PART1_NAME = "country";
+    String PART2_NAME = "hour";
+    String[] partNames = {PART1_NAME, PART2_NAME};
+    List<String> partitionVals = null;
+    String PART1_VALUE = "%{" + PART1_NAME + "}";
+    String PART2_VALUE = "%y-%m-%d-%k";
+    partitionVals = new ArrayList<String>(2);
+    partitionVals.add(PART1_VALUE);
+    partitionVals.add(PART2_VALUE);
+
+    String tblName = "hourlydata";
+    TestUtil.dropDB(conf, dbName2);
+    String dbLocation = dbFolder.newFolder(dbName2).getCanonicalPath() + ".db";
+    dbLocation = dbLocation.replaceAll("\\\\","/"); // for windows paths
+    TestUtil.createDbAndTable(driver, dbName2, tblName, partitionVals, colNames,
+            colTypes, partNames, dbLocation);
+
+    int totalRecords = 4;
+    int batchSize = 2;
+    int batchCount = totalRecords / batchSize;
+
+    Context context = new Context();
+    context.put("hive.metastore",metaStoreURI);
+    context.put("hive.database",dbName2);
+    context.put("hive.table",tblName);
+    context.put("hive.partition", PART1_VALUE + "," + PART2_VALUE);
+    context.put("autoCreatePartitions","true");
+    context.put("useLocalTimeStamp", "false");
+    context.put("batchSize","" + batchSize);
+    context.put("serializer", HiveDelimitedTextSerializer.ALIAS);
+    context.put("serializer.fieldnames", COL1 + ",," + COL2 + ",");
+    context.put("heartBeatInterval", "0");
+
+    Channel channel = startSink(sink, context);
+
+    Calendar eventDate = Calendar.getInstance();
+    List<String> bodies = Lists.newArrayList();
+
+    // push four events; consecutive events alternate between two different hours
+    Transaction txn = channel.getTransaction();
+    txn.begin();
+    for (int j = 1; j <= totalRecords; j++) {
+      Event event = new SimpleEvent();
+      String body = j + ",blah,This is a log message,other stuff";
+      event.setBody(body.getBytes());
+      eventDate.clear();
+      eventDate.set(2014, 03, 03, j % batchCount, 1); // year, month (0-based), day, hour, minute
+      event.getHeaders().put( "timestamp",
+              String.valueOf(eventDate.getTimeInMillis()) );
+      event.getHeaders().put( PART1_NAME, "Asia" );
+      bodies.add(body);
+      channel.put(event);
+    }
+    // commit so the sink can take the events from the channel
+    txn.commit();
+    txn.close();
+
+    checkRecordCountInTable(0, dbName2, tblName);
+    for (int i = 0; i < batchCount ; i++) {
+      sink.process();
+    }
+    checkRecordCountInTable(totalRecords, dbName2, tblName);
+    sink.stop();
+
+    // verify counters
+    SinkCounter counter = sink.getCounter();
+    Assert.assertEquals(2, counter.getConnectionCreatedCount());
+    Assert.assertEquals(2, counter.getConnectionClosedCount());
+    Assert.assertEquals(2, counter.getBatchCompleteCount());
+    Assert.assertEquals(0, counter.getBatchEmptyCount());
+    Assert.assertEquals(0, counter.getConnectionFailedCount() );
+    Assert.assertEquals(4, counter.getEventDrainAttemptCount());
+    Assert.assertEquals(4, counter.getEventDrainSuccessCount() );
+
+  }
+
+  @Test
+  public void testHeartBeat()
+          throws EventDeliveryException, IOException, CommandNeedRetryException {
+    int batchSize = 2;
+    int batchCount = 3;
+    int totalRecords = batchCount * batchSize;
+    Context context = new Context();
+    context.put("hive.metastore", metaStoreURI);
+    context.put("hive.database", dbName);
+    context.put("hive.table", tblName);
+    context.put("hive.partition", PART1_VALUE + "," + PART2_VALUE);
+    context.put("autoCreatePartitions","true");
+    context.put("batchSize","" + batchSize);
+    context.put("serializer", HiveDelimitedTextSerializer.ALIAS);
+    context.put("serializer.fieldnames", COL1 + ",," + COL2 + ",");
+    context.put("hive.txnsPerBatchAsk", "20");
+    context.put("heartBeatInterval", "3"); // heartbeat in seconds
+
+    Channel channel = startSink(sink, context);
+
+    List<String> bodies = Lists.newArrayList();
+
+    // push the events in batchCount batches
+    for (int i = 0; i < batchCount; i++) {
+      Transaction txn = channel.getTransaction();
+      txn.begin();
+      for (int j = 1; j <= batchSize; j++) {
+        Event event = new SimpleEvent();
+        String body = i * j + ",blah,This is a log message,other stuff";
+        event.setBody(body.getBytes());
+        bodies.add(body);
+        channel.put(event);
+      }
+      // commit so the sink can take the events from the channel
+      txn.commit();
+      txn.close();
+
+      sink.process();
+      sleep(3000); // allow heartbeat to happen
+    }
+
+    sink.stop();
+    checkRecordCountInTable(totalRecords, dbName, tblName);
+  }
+
+  @Test
+  public void testJsonSerializer() throws Exception {
+    int batchSize = 2;
+    int batchCount = 2;
+    int totalRecords = batchCount * batchSize;
+    Context context = new Context();
+    context.put("hive.metastore",metaStoreURI);
+    context.put("hive.database",dbName);
+    context.put("hive.table",tblName);
+    context.put("hive.partition", PART1_VALUE + "," + PART2_VALUE);
+    context.put("autoCreatePartitions","true");
+    context.put("batchSize","" + batchSize);
+    context.put("serializer", HiveJsonSerializer.ALIAS);
+    context.put("serializer.fieldnames", COL1 + ",," + COL2 + ",");
+    context.put("heartBeatInterval", "0");
+
+    Channel channel = startSink(sink, context);
+
+    List<String> bodies = Lists.newArrayList();
+
+    // push the events in two batches
+    for (int i = 0; i < batchCount; i++) {
+      Transaction txn = channel.getTransaction();
+      txn.begin();
+      for (int j = 1; j <= batchSize; j++) {
+        Event event = new SimpleEvent();
+        String body = "{\"id\" : 1, \"msg\" : \"using json serializer\"}";
+        event.setBody(body.getBytes());
+        bodies.add(body);
+        channel.put(event);
+      }
+      // commit so the sink can take the events from the channel
+      txn.commit();
+      txn.close();
+
+      sink.process();
+    }
+    checkRecordCountInTable(totalRecords, dbName, tblName);
+    sink.stop();
+    checkRecordCountInTable(totalRecords, dbName, tblName);
+  }
+
+  private void sleep(int n) {
+    try {
+      Thread.sleep(n);
+    } catch (InterruptedException e) {
+    }
+  }
+
+  private static Channel startSink(HiveSink sink, Context context) {
+    Configurables.configure(sink, context);
+
+    Channel channel = new MemoryChannel();
+    Configurables.configure(channel, context);
+    sink.setChannel(channel);
+    sink.start();
+    return channel;
+  }
+
+  private void checkRecordCountInTable(int expectedCount, String db, String tbl)
+          throws CommandNeedRetryException, IOException {
+    int count = TestUtil.listRecordsInTable(driver, db, tbl).size();
+    Assert.assertEquals(expectedCount, count);
+  }
+}
diff --git a/code/flume-ng-sinks/flume-hive-sink/src/test/java/org/apache/flume/sink/hive/TestHiveWriter.java b/code/flume-ng-sinks/flume-hive-sink/src/test/java/org/apache/flume/sink/hive/TestHiveWriter.java
new file mode 100644
index 0000000..4d7c9bb
--- /dev/null
+++ b/code/flume-ng-sinks/flume-hive-sink/src/test/java/org/apache/flume/sink/hive/TestHiveWriter.java
@@ -0,0 +1,351 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.flume.sink.hive;
+
+import com.google.common.util.concurrent.ThreadFactoryBuilder;
+import junit.framework.Assert;
+import org.apache.flume.Context;
+import org.apache.flume.event.SimpleEvent;
+import org.apache.flume.instrumentation.SinkCounter;
+import org.apache.hadoop.hive.cli.CliSessionState;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.txn.TxnDbUtil;
+import org.apache.hadoop.hive.ql.CommandNeedRetryException;
+import org.apache.hadoop.hive.ql.Driver;
+import org.apache.hadoop.hive.ql.session.SessionState;
+import org.apache.hive.hcatalog.streaming.HiveEndPoint;
+import org.junit.Before;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.rules.TemporaryFolder;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+
+public class TestHiveWriter {
+  static final String dbName = "testing";
+  static final String tblName = "alerts";
+
+  public static final String PART1_NAME = "continent";
+  public static final String PART2_NAME = "country";
+  public static final String[] partNames = { PART1_NAME, PART2_NAME };
+
+  private static final String COL1 = "id";
+  private static final String COL2 = "msg";
+  final String[] colNames = {COL1,COL2};
+  private String[] colTypes = { "int", "string" };
+
+  private static final String PART1_VALUE = "Asia";
+  private static final String PART2_VALUE = "India";
+  private final ArrayList<String> partVals;
+
+  private final String metaStoreURI;
+
+  private HiveDelimitedTextSerializer serializer;
+
+  private final HiveConf conf;
+
+  private ExecutorService callTimeoutPool;
+  int timeout = 10000; // msec
+
+  @Rule
+  public TemporaryFolder dbFolder = new TemporaryFolder();
+
+  private final Driver driver;
+
+  public TestHiveWriter() throws Exception {
+    partVals = new ArrayList<String>(2);
+    partVals.add(PART1_VALUE);
+    partVals.add(PART2_VALUE);
+
+    metaStoreURI = null;
+
+    int callTimeoutPoolSize = 1;
+    callTimeoutPool = Executors.newFixedThreadPool(callTimeoutPoolSize,
+            new ThreadFactoryBuilder().setNameFormat("hiveWriterTest").build());
+
+    // 1) Start metastore
+    conf = new HiveConf(this.getClass());
+    TestUtil.setConfValues(conf);
+    if (metaStoreURI != null) {
+      conf.setVar(HiveConf.ConfVars.METASTOREURIS, metaStoreURI);
+    }
+
+    // 2) Setup Hive client
+    SessionState.start(new CliSessionState(conf));
+    driver = new Driver(conf);
+
+  }
+
+  @Before
+  public void setUp() throws Exception {
+    // 1) prepare hive
+    TxnDbUtil.cleanDb();
+    TxnDbUtil.prepDb();
+
+    // 2) Setup tables
+    TestUtil.dropDB(conf, dbName);
+    String dbLocation = dbFolder.newFolder(dbName).getCanonicalPath() + ".db";
+    dbLocation = dbLocation.replaceAll("\\\\","/"); // for windows paths
+    TestUtil.createDbAndTable(driver, dbName, tblName, partVals, colNames, colTypes, partNames,
+                              dbLocation);
+
+    // 3) Setup serializer
+    Context ctx = new Context();
+    ctx.put("serializer.fieldnames", COL1 + ",," + COL2 + ",");
+    serializer = new HiveDelimitedTextSerializer();
+    serializer.configure(ctx);
+  }
+
+  @Test
+  public void testInstantiate() throws Exception {
+    HiveEndPoint endPoint = new HiveEndPoint(metaStoreURI, dbName, tblName, partVals);
+    SinkCounter sinkCounter = new SinkCounter(this.getClass().getName());
+    HiveWriter writer = new HiveWriter(endPoint, 10, true, timeout, callTimeoutPool, "flumetest",
+                                       serializer, sinkCounter);
+
+    writer.close();
+  }
+
+  @Test
+  public void testWriteBasic() throws Exception {
+    HiveEndPoint endPoint = new HiveEndPoint(metaStoreURI, dbName, tblName, partVals);
+    SinkCounter sinkCounter = new SinkCounter(this.getClass().getName());
+    HiveWriter writer = new HiveWriter(endPoint, 10, true, timeout, callTimeoutPool, "flumetest",
+                                       serializer, sinkCounter);
+
+    writeEvents(writer,3);
+    writer.flush(false);
+    writer.close();
+    checkRecordCountInTable(3);
+  }
+
+  @Test
+  public void testWriteMultiFlush() throws Exception {
+    HiveEndPoint endPoint = new HiveEndPoint(metaStoreURI, dbName, tblName, partVals);
+    SinkCounter sinkCounter = new SinkCounter(this.getClass().getName());
+
+    HiveWriter writer = new HiveWriter(endPoint, 10, true, timeout, callTimeoutPool, "flumetest",
+                                       serializer, sinkCounter);
+
+    checkRecordCountInTable(0);
+    SimpleEvent event = new SimpleEvent();
+
+    String REC1 = "1,xyz,Hello world,abc";
+    event.setBody(REC1.getBytes());
+    writer.write(event);
+    checkRecordCountInTable(0);
+    writer.flush(true);
+    checkRecordCountInTable(1);
+
+    String REC2 = "2,xyz,Hello world,abc";
+    event.setBody(REC2.getBytes());
+    writer.write(event);
+    checkRecordCountInTable(1);
+    writer.flush(true);
+    checkRecordCountInTable(2);
+
+    String REC3 = "3,xyz,Hello world,abc";
+    event.setBody(REC3.getBytes());
+    writer.write(event);
+    writer.flush(true);
+    checkRecordCountInTable(3);
+    writer.close();
+
+    checkRecordCountInTable(3);
+  }
+
+  @Test
+  public void testTxnBatchConsumption() throws Exception {
+    // get a small txn batch and consume it, then roll to a new batch; verify
+    // the number of remaining txns to ensure txns are not accidentally skipped
+
+    HiveEndPoint endPoint = new HiveEndPoint(metaStoreURI, dbName, tblName, partVals);
+    SinkCounter sinkCounter = new SinkCounter(this.getClass().getName());
+
+    int txnPerBatch = 3;
+
+    HiveWriter writer = new HiveWriter(endPoint, txnPerBatch, true, timeout, callTimeoutPool,
+                                       "flumetest", serializer, sinkCounter);
+
+    Assert.assertEquals(2, writer.getRemainingTxns());
+    writer.flush(true);
+
+    Assert.assertEquals(1, writer.getRemainingTxns());
+    writer.flush(true);
+
+    Assert.assertEquals(0, writer.getRemainingTxns());
+    writer.flush(true);
+
+    // flip over to next batch
+    Assert.assertEquals(2, writer.getRemainingTxns());
+    writer.flush(true);
+
+    Assert.assertEquals(1, writer.getRemainingTxns());
+
+    writer.close();
+
+  }
+
+  private void checkRecordCountInTable(int expectedCount)
+          throws CommandNeedRetryException, IOException {
+    int count = TestUtil.listRecordsInTable(driver, dbName, tblName).size();
+    Assert.assertEquals(expectedCount, count);
+  }
+
+  /**
+   * Sets up input fields to have the same order as the table columns.
+   * Also sets the serde separator to be the same as the input field separator.
+   * @throws Exception
+   */
+  @Test
+  public void testInOrderWrite() throws Exception {
+    HiveEndPoint endPoint = new HiveEndPoint(metaStoreURI, dbName, tblName, partVals);
+    SinkCounter sinkCounter = new SinkCounter(this.getClass().getName());
+    int timeout = 5000; // msec
+
+    HiveDelimitedTextSerializer serializer2 = new HiveDelimitedTextSerializer();
+    Context ctx = new Context();
+    ctx.put("serializer.fieldnames", COL1 + "," + COL2);
+    ctx.put("serializer.serdeSeparator", ",");
+    serializer2.configure(ctx);
+
+
+    HiveWriter writer = new HiveWriter(endPoint, 10, true, timeout, callTimeoutPool,
+            "flumetest", serializer2, sinkCounter);
+
+    SimpleEvent event = new SimpleEvent();
+    event.setBody("1,Hello world 1".getBytes());
+    writer.write(event);
+    event.setBody("2,Hello world 2".getBytes());
+    writer.write(event);
+    event.setBody("3,Hello world 3".getBytes());
+    writer.write(event);
+    writer.flush(false);
+    writer.close();
+  }
+
+  @Test
+  public void testSerdeSeparatorCharParsing() throws Exception {
+    HiveEndPoint endPoint = new HiveEndPoint(metaStoreURI, dbName, tblName, partVals);
+    SinkCounter sinkCounter = new SinkCounter(this.getClass().getName());
+    int timeout = 10000; // msec
+
+    // 1)  single character serdeSeparator
+    HiveDelimitedTextSerializer serializer1 = new HiveDelimitedTextSerializer();
+    Context ctx = new Context();
+    ctx.put("serializer.fieldnames", COL1 + "," + COL2);
+    ctx.put("serializer.serdeSeparator", ",");
+    serializer1.configure(ctx);
+    // should not throw
+
+
+    // 2) special character as serdeSeparator
+    HiveDelimitedTextSerializer serializer2 = new HiveDelimitedTextSerializer();
+    ctx = new Context();
+    ctx.put("serializer.fieldnames", COL1 + "," + COL2);
+    ctx.put("serializer.serdeSeparator", "'\t'");
+    serializer2.configure(ctx);
+    // should not throw
+
+
+    // 3) bad spec as serdeSeparator
+    HiveDelimitedTextSerializer serializer3 = new HiveDelimitedTextSerializer();
+    ctx = new Context();
+    ctx.put("serializer.fieldnames", COL1 + "," + COL2);
+    ctx.put("serializer.serdeSeparator", "ab");
+    try {
+      serializer3.configure(ctx);
+      Assert.fail("Bad serdeSeparator character was accepted");
+    } catch (Exception e) {
+      // expect an exception
+    }
+
+  }
+
+  @Test
+  public void testSecondWriterBeforeFirstCommits() throws Exception {
+    // here we open a new writer while the first is still writing (not committed)
+    HiveEndPoint endPoint1 = new HiveEndPoint(metaStoreURI, dbName, tblName, partVals);
+    ArrayList<String> partVals2 = new ArrayList<String>(2);
+    partVals2.add(PART1_VALUE);
+    partVals2.add("Nepal");
+    HiveEndPoint endPoint2 = new HiveEndPoint(metaStoreURI, dbName, tblName, partVals2);
+
+    SinkCounter sinkCounter1 = new SinkCounter(this.getClass().getName());
+    SinkCounter sinkCounter2 = new SinkCounter(this.getClass().getName());
+
+    HiveWriter writer1 = new HiveWriter(endPoint1, 10, true, timeout, callTimeoutPool, "flumetest",
+                                        serializer, sinkCounter1);
+
+    writeEvents(writer1, 3);
+
+    HiveWriter writer2 = new HiveWriter(endPoint2, 10, true, timeout, callTimeoutPool, "flumetest",
+                                        serializer, sinkCounter2);
+    writeEvents(writer2, 3);
+    writer2.flush(false); // commit
+
+    writer1.flush(false); // commit
+    writer1.close();
+
+    writer2.close();
+  }
+
+  @Test
+  public void testSecondWriterAfterFirstCommits() throws Exception {
+    // here we open a new writer after the first writer has committed one txn
+    HiveEndPoint endPoint1 = new HiveEndPoint(metaStoreURI, dbName, tblName, partVals);
+    ArrayList<String> partVals2 = new ArrayList<String>(2);
+    partVals2.add(PART1_VALUE);
+    partVals2.add("Nepal");
+    HiveEndPoint endPoint2 = new HiveEndPoint(metaStoreURI, dbName, tblName, partVals2);
+
+    SinkCounter sinkCounter1 = new SinkCounter(this.getClass().getName());
+    SinkCounter sinkCounter2 = new SinkCounter(this.getClass().getName());
+
+    HiveWriter writer1 = new HiveWriter(endPoint1, 10, true, timeout, callTimeoutPool, "flumetest",
+                                        serializer, sinkCounter1);
+
+    writeEvents(writer1, 3);
+
+    writer1.flush(false); // commit
+
+
+    HiveWriter writer2 = new HiveWriter(endPoint2, 10, true, timeout, callTimeoutPool, "flumetest",
+                                        serializer, sinkCounter2);
+    writeEvents(writer2, 3);
+    writer2.flush(false); // commit
+
+
+    writer1.close();
+    writer2.close();
+  }
+
+  private void writeEvents(HiveWriter writer, int count)
+      throws InterruptedException, HiveWriter.WriteException {
+    SimpleEvent event = new SimpleEvent();
+    for (int i = 1; i <= count; i++) {
+      event.setBody((i + ",xyz,Hello world,abc").getBytes());
+      writer.write(event);
+    }
+  }
+}
diff --git a/code/flume-ng-sinks/flume-hive-sink/src/test/java/org/apache/flume/sink/hive/TestUtil.java b/code/flume-ng-sinks/flume-hive-sink/src/test/java/org/apache/flume/sink/hive/TestUtil.java
new file mode 100644
index 0000000..1fcb4eb
--- /dev/null
+++ b/code/flume-ng-sinks/flume-hive-sink/src/test/java/org/apache/flume/sink/hive/TestUtil.java
@@ -0,0 +1,233 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+
+package org.apache.flume.sink.hive;
+
+import org.apache.hadoop.fs.FileStatus;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.RawLocalFileSystem;
+import org.apache.hadoop.fs.permission.FsPermission;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;
+import org.apache.hadoop.hive.metastore.IMetaStoreClient;
+import org.apache.hadoop.hive.metastore.api.MetaException;
+import org.apache.hadoop.hive.ql.CommandNeedRetryException;
+import org.apache.hadoop.hive.ql.Driver;
+import org.apache.hadoop.hive.ql.metadata.HiveException;
+import org.apache.hadoop.hive.shims.ShimLoader;
+import org.apache.hadoop.util.Shell;
+import org.apache.hive.hcatalog.streaming.QueryFailedException;
+import org.apache.thrift.TException;
+
+import java.io.File;
+import java.io.FileNotFoundException;
+import java.io.IOException;
+import java.net.URI;
+import java.net.URISyntaxException;
+import java.util.ArrayList;
+import java.util.List;
+
+public class TestUtil {
+
+  private static final String txnMgr = "org.apache.hadoop.hive.ql.lockmgr.DbTxnManager";
+
+  /**
+   * Set up the configuration so it will use the DbTxnManager, concurrency will be set to true,
+   * and the JDBC configs will be set for putting the transaction and lock info in the embedded
+   * metastore.
+   * @param conf HiveConf to add these values to.
+   */
+  public static void setConfValues(HiveConf conf) {
+    conf.setVar(HiveConf.ConfVars.HIVE_TXN_MANAGER, txnMgr);
+    conf.setBoolVar(HiveConf.ConfVars.HIVE_SUPPORT_CONCURRENCY, true);
+    conf.set("fs.raw.impl", RawFileSystem.class.getName());
+  }
+
+  public static void createDbAndTable(Driver driver, String databaseName,
+                                      String tableName, List<String> partVals,
+                                      String[] colNames, String[] colTypes,
+                                      String[] partNames, String dbLocation)
+          throws Exception {
+    String dbUri = "raw://" + dbLocation;
+    String tableLoc = dbUri + Path.SEPARATOR + tableName;
+
+    runDDL(driver, "create database IF NOT EXISTS " + databaseName + " location '" + dbUri + "'");
+    runDDL(driver, "use " + databaseName);
+    String crtTbl = "create table " + tableName +
+            " ( " +  getTableColumnsStr(colNames,colTypes) + " )" +
+            getPartitionStmtStr(partNames) +
+            " clustered by ( " + colNames[0] + " )" +
+            " into 10 buckets " +
+            " stored as orc " +
+            " location '" + tableLoc +  "'" +
+            " TBLPROPERTIES ('transactional'='true')";
+
+    runDDL(driver, crtTbl);
+    System.out.println("crtTbl = " + crtTbl);
+    if (partNames != null && partNames.length != 0) {
+      String addPart = "alter table " + tableName + " add partition ( " +
+              getTablePartsStr2(partNames, partVals) + " )";
+      runDDL(driver, addPart);
+    }
+  }
+
+  private static String getPartitionStmtStr(String[] partNames) {
+    if ( partNames == null || partNames.length == 0) {
+      return "";
+    }
+    return " partitioned by (" + getTablePartsStr(partNames) + " )";
+  }
+
+  // delete db and all tables in it
+  public static void dropDB(HiveConf conf, String databaseName)
+      throws HiveException, MetaException {
+    IMetaStoreClient client = new HiveMetaStoreClient(conf);
+    try {
+      for (String table : client.listTableNamesByFilter(databaseName, "", (short)-1)) {
+        client.dropTable(databaseName, table, true, true);
+      }
+      client.dropDatabase(databaseName);
+    } catch (TException e) {
+      client.close();
+    }
+  }
+
+  private static String getTableColumnsStr(String[] colNames, String[] colTypes) {
+    StringBuilder sb = new StringBuilder();
+    for (int i = 0; i < colNames.length; ++i) {
+      sb.append(colNames[i] + " " + colTypes[i]);
+      if (i < colNames.length - 1) {
+        sb.append(",");
+      }
+    }
+    return sb.toString();
+  }
+
+  // converts partNames into "partName1 string, partName2 string"
+  private static String getTablePartsStr(String[] partNames) {
+    if (partNames == null || partNames.length == 0) {
+      return "";
+    }
+    StringBuilder sb = new StringBuilder();
+    for (int i = 0; i < partNames.length; ++i) {
+      sb.append(partNames[i] + " string");
+      if (i < partNames.length - 1) {
+        sb.append(",");
+      }
+    }
+    return sb.toString();
+  }
+
+  // converts partNames,partVals into "partName1=val1, partName2=val2"
+  private static String getTablePartsStr2(String[] partNames, List<String> partVals) {
+    StringBuilder sb = new StringBuilder();
+    for (int i = 0; i < partVals.size(); ++i) {
+      sb.append(partNames[i] + " = '" + partVals.get(i) + "'");
+      if (i < partVals.size() - 1) {
+        sb.append(",");
+      }
+    }
+    return sb.toString();
+  }
+
+  public static ArrayList<String> listRecordsInTable(Driver driver, String dbName, String tblName)
+      throws CommandNeedRetryException, IOException {
+    driver.run("select * from " + dbName + "." + tblName);
+    ArrayList<String> res = new ArrayList<String>();
+    driver.getResults(res);
+    return res;
+  }
+
+  public static ArrayList<String> listRecordsInPartition(Driver driver, String dbName,
+                                                         String tblName, String continent,
+                                                         String country)
+      throws CommandNeedRetryException, IOException {
+    driver.run("select * from " + dbName + "." + tblName + " where continent='"
+            + continent + "' and country='" + country + "'");
+    ArrayList<String> res = new ArrayList<String>();
+    driver.getResults(res);
+    return res;
+  }
+
+  public static class RawFileSystem extends RawLocalFileSystem {
+    private static final URI NAME;
+
+    static {
+      try {
+        NAME = new URI("raw:///");
+      } catch (URISyntaxException se) {
+        throw new IllegalArgumentException("bad uri", se);
+      }
+    }
+
+    @Override
+    public URI getUri() {
+      return NAME;
+    }
+
+    static String execCommand(File f, String... cmd) throws IOException {
+      String[] args = new String[cmd.length + 1];
+      System.arraycopy(cmd, 0, args, 0, cmd.length);
+      args[cmd.length] = f.getCanonicalPath();
+      String output = Shell.execCommand(args);
+      return output;
+    }
+
+    @Override
+    public FileStatus getFileStatus(Path path) throws IOException {
+      File file = pathToFile(path);
+      if (!file.exists()) {
+        throw new FileNotFoundException("Can't find " + path);
+      }
+      // approximate the permission bits from what java.io.File can report
+      short mod = 0;
+      if (file.canRead()) {
+        mod |= 0444;
+      }
+      if (file.canWrite()) {
+        mod |= 0200;
+      }
+      if (file.canExecute()) {
+        mod |= 0111;
+      }
+      ShimLoader.getHadoopShims();
+      return new FileStatus(file.length(), file.isDirectory(), 1, 1024,
+              file.lastModified(), file.lastModified(),
+              FsPermission.createImmutable(mod), "owen", "users", path);
+    }
+  }
+
+  private static boolean runDDL(Driver driver, String sql) throws QueryFailedException {
+    int retryCount = 1; // # of times to retry if first attempt fails
+    for (int attempt = 0; attempt <= retryCount; ++attempt) {
+      try {
+        driver.run(sql);
+        return true;
+      } catch (CommandNeedRetryException e) {
+        if (attempt == retryCount) {
+          throw new QueryFailedException(sql, e);
+        }
+        continue;
+      }
+    } // for
+    return false;
+  }
+
+}
diff --git a/code/flume-ng-sinks/flume-hive-sink/src/test/resources/log4j.properties b/code/flume-ng-sinks/flume-hive-sink/src/test/resources/log4j.properties
new file mode 100644
index 0000000..252b5ea
--- /dev/null
+++ b/code/flume-ng-sinks/flume-hive-sink/src/test/resources/log4j.properties
@@ -0,0 +1,26 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+log4j.rootLogger = INFO, out
+
+log4j.appender.out = org.apache.log4j.ConsoleAppender
+log4j.appender.out.layout = org.apache.log4j.PatternLayout
+log4j.appender.out.layout.ConversionPattern = %d (%t) [%p - %l] %m%n
+
+log4j.logger.org.apache.flume = DEBUG
+log4j.logger.org.apache.hadoop = WARN
+log4j.logger.org.mortbay = WARN
diff --git a/code/flume-ng-sinks/flume-irc-sink/pom.xml b/code/flume-ng-sinks/flume-irc-sink/pom.xml
new file mode 100644
index 0000000..6345f59
--- /dev/null
+++ b/code/flume-ng-sinks/flume-irc-sink/pom.xml
@@ -0,0 +1,83 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to You under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+     http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+
+  <modelVersion>4.0.0</modelVersion>
+
+  <parent>
+    <artifactId>flume-ng-sinks</artifactId>
+    <groupId>org.apache.flume</groupId>
+    <version>1.7.0</version>
+  </parent>
+
+  <groupId>org.apache.flume.flume-ng-sinks</groupId>
+  <artifactId>flume-irc-sink</artifactId>
+  <name>Flume NG IRC Sink</name>
+
+  <build>
+    <plugins>
+      <plugin>
+        <groupId>org.apache.rat</groupId>
+        <artifactId>apache-rat-plugin</artifactId>
+      </plugin>
+    </plugins>
+  </build>
+
+  <dependencies>
+
+    <dependency>
+      <groupId>org.apache.flume</groupId>
+      <artifactId>flume-ng-sdk</artifactId>
+    </dependency>
+
+    <dependency>
+      <groupId>org.apache.flume</groupId>
+      <artifactId>flume-ng-configuration</artifactId>
+    </dependency>
+
+    <dependency>
+      <groupId>org.apache.flume</groupId>
+      <artifactId>flume-ng-core</artifactId>
+    </dependency>
+
+    <dependency>
+      <groupId>org.slf4j</groupId>
+      <artifactId>slf4j-api</artifactId>
+    </dependency>
+
+    <dependency>
+      <groupId>org.schwering</groupId>
+      <artifactId>irclib</artifactId>
+    </dependency>
+
+    <dependency>
+      <groupId>junit</groupId>
+      <artifactId>junit</artifactId>
+      <scope>test</scope>
+    </dependency>
+
+    <dependency>
+      <groupId>org.slf4j</groupId>
+      <artifactId>slf4j-log4j12</artifactId>
+      <scope>test</scope>
+    </dependency>
+
+  </dependencies>
+
+</project>
diff --git a/code/flume-ng-sinks/flume-irc-sink/src/main/java/org/apache/flume/sink/irc/IRCSink.java b/code/flume-ng-sinks/flume-irc-sink/src/main/java/org/apache/flume/sink/irc/IRCSink.java
new file mode 100644
index 0000000..52bbfc8
--- /dev/null
+++ b/code/flume-ng-sinks/flume-irc-sink/src/main/java/org/apache/flume/sink/irc/IRCSink.java
@@ -0,0 +1,266 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flume.sink.irc;
+
+import java.io.IOException;
+
+import org.apache.flume.Channel;
+import org.apache.flume.ChannelException;
+import org.apache.flume.Context;
+import org.apache.flume.CounterGroup;
+import org.apache.flume.Event;
+import org.apache.flume.EventDeliveryException;
+import org.apache.flume.Transaction;
+import org.apache.flume.conf.Configurable;
+import org.apache.flume.sink.AbstractSink;
+import org.schwering.irc.lib.IRCConnection;
+import org.schwering.irc.lib.IRCEventListener;
+import org.schwering.irc.lib.IRCModeParser;
+import org.schwering.irc.lib.IRCUser;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import com.google.common.base.Preconditions;
+
+public class IRCSink extends AbstractSink implements Configurable {
+
+  private static final Logger logger = LoggerFactory.getLogger(IRCSink.class);
+
+  private static final int DEFAULT_PORT = 6667;
+  private static final String DEFAULT_SPLIT_CHARS = "\n";
+
+  private static final String IRC_CHANNEL_PREFIX = "#";
+
+  private IRCConnection connection = null;
+
+  private String hostname;
+  private Integer port;
+  private String nick;
+  private String password;
+  private String user;
+  private String name;
+  private String chan;
+  private Boolean splitLines;
+  private String splitChars;
+  
+  private CounterGroup counterGroup;
+
+  public static class IRCConnectionListener implements IRCEventListener {
+
+    public void onRegistered() {
+    }
+
+    public void onDisconnected() {
+      logger.error("IRC sink disconnected");
+    }
+
+    public void onError(String msg) {
+      logger.error("IRC sink error: {}", msg);
+    }
+
+    public void onError(int num, String msg) {
+      logger.error("IRC sink error: {} - {}", num, msg);
+    }
+
+    public void onInvite(String chan, IRCUser u, String nickPass) {
+    }
+
+    public void onJoin(String chan, IRCUser u) {
+    }
+
+    public void onKick(String chan, IRCUser u, String nickPass, String msg) {
+    }
+
+    public void onMode(IRCUser u, String nickPass, String mode) {
+    }
+
+    public void onMode(String chan, IRCUser u, IRCModeParser mp) {
+    }
+
+    public void onNick(IRCUser u, String nickNew) {
+    }
+
+    public void onNotice(String target, IRCUser u, String msg) {
+    }
+
+    public void onPart(String chan, IRCUser u, String msg) {
+    }
+
+    public void onPrivmsg(String chan, IRCUser u, String msg) {
+    }
+
+    public void onQuit(IRCUser u, String msg) {
+    }
+
+    public void onReply(int num, String value, String msg) {
+    }
+
+    public void onTopic(String chan, IRCUser u, String topic) {
+    }
+
+    public void onPing(String p) {
+    }
+
+    public void unknown(String a, String b, String c, String d) {
+    }
+  }
+
+  public IRCSink() {
+    counterGroup = new CounterGroup();
+  }
+
+  public void configure(Context context) {
+    hostname = context.getString("hostname");
+    String portStr = context.getString("port");
+    nick = context.getString("nick");
+    password = context.getString("password");
+    user = context.getString("user");
+    name = context.getString("name");
+    chan = context.getString("chan");
+    splitLines = context.getBoolean("splitlines", false);
+    splitChars = context.getString("splitchars");
+
+    if (portStr != null) {
+      port = Integer.parseInt(portStr);
+    } else {
+      port = DEFAULT_PORT;
+    }
+
+    if (splitChars == null) {
+      splitChars = DEFAULT_SPLIT_CHARS;
+    }
+    
+    Preconditions.checkState(hostname != null, "No hostname specified");
+    Preconditions.checkState(nick != null, "No nick specified");
+    Preconditions.checkState(chan != null, "No chan specified");
+  }
+
+  private void createConnection() throws IOException {
+    if (connection == null) {
+      logger.debug(
+          "Creating new connection to hostname:{} port:{}",
+          hostname, port);
+      connection = new IRCConnection(hostname, new int[] { port },
+          password, nick, user, name);
+      connection.addIRCEventListener(new IRCConnectionListener());
+      connection.setEncoding("UTF-8");
+      connection.setPong(true);
+      connection.setDaemon(false);
+      connection.setColors(false);
+      connection.connect();
+      connection.send("join " + IRC_CHANNEL_PREFIX + chan);
+    }
+  }
+
+  private void destroyConnection() {
+    if (connection != null) {
+      logger.debug("Destroying connection to: {}:{}", hostname, port);
+      connection.close();
+    }
+
+    connection = null;
+  }
+
+  @Override
+  public void start() {
+    logger.info("IRC sink starting");
+
+    try {
+      createConnection();
+    } catch (Exception e) {
+      logger.error("Unable to create IRC client using hostname:"
+          + hostname + " port:" + port + ". Exception follows.", e);
+
+      /* Try to prevent leaking resources. */
+      destroyConnection();
+
+      /* FIXME: Mark ourselves as failed. */
+      return;
+    }
+
+    super.start();
+
+    logger.debug("IRC sink {} started", this.getName());
+  }
+
+  @Override
+  public void stop() {
+    logger.info("IRC sink {} stopping", this.getName());
+
+    destroyConnection();
+
+    super.stop();
+
+    logger.debug("IRC sink {} stopped. Metrics:{}", this.getName(), counterGroup);
+  }
+
+  private void sendLine(Event event) {
+    String body = new String(event.getBody());
+    
+    if (splitLines) {
+      String[] lines = body.split(splitChars);
+      for (String line: lines) {
+        connection.doPrivmsg(IRC_CHANNEL_PREFIX + this.chan, line);
+      }
+    } else {
+      connection.doPrivmsg(IRC_CHANNEL_PREFIX + this.chan, body);
+    }
+    
+  }
+  
+  @Override
+  public Status process() throws EventDeliveryException {
+    Status status = Status.READY;
+    Channel channel = getChannel();
+    Transaction transaction = channel.getTransaction();
+
+    try {
+      transaction.begin();
+      createConnection();
+
+      Event event = channel.take();
+
+      if (event == null) {
+        counterGroup.incrementAndGet("event.empty");
+        status = Status.BACKOFF;
+      } else {
+        sendLine(event);
+        counterGroup.incrementAndGet("event.irc");
+      }
+
+      transaction.commit();
+
+    } catch (ChannelException e) {
+      transaction.rollback();
+      logger.error(
+          "Unable to get event from channel. Exception follows.", e);
+      status = Status.BACKOFF;
+    } catch (Exception e) {
+      transaction.rollback();
+      logger.error(
+          "Unable to communicate with IRC server. Exception follows.",
+          e);
+      status = Status.BACKOFF;
+      destroyConnection();
+    } finally {
+      transaction.close();
+    }
+
+    return status;
+  }
+}
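
The splitlines/splitchars handling in sendLine() above can be illustrated in isolation. The following is only a minimal sketch using the JDK alone: the class and method names are hypothetical, and decoding the body as UTF-8 is an assumption (the sink itself decodes with the platform default charset).

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the patch: mirrors the branching in
// IRCSink.sendLine() without needing an IRC connection.
final class IrcLineSplitSketch {

  private IrcLineSplitSketch() {
  }

  // Returns the messages that would be sent for one event body.
  // Assumption: the body is UTF-8 text.
  static List<String> toMessages(byte[] body, boolean splitLines,
      String splitChars) {
    String text = new String(body, StandardCharsets.UTF_8);
    List<String> messages = new ArrayList<>();
    if (splitLines) {
      // String.split() interprets splitChars as a regular expression,
      // exactly as the sink does.
      for (String line : text.split(splitChars)) {
        messages.add(line);
      }
    } else {
      // Without splitting, the whole body goes out as a single message.
      messages.add(text);
    }
    return messages;
  }
}
```

Because the separator is handed to String.split() unchanged, a splitchars value containing regex metacharacters would need escaping to be matched literally.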
diff --git a/code/flume-ng-sinks/flume-irc-sink/src/test/java/org/apache/flume/sink/irc/TestIRCSink.java b/code/flume-ng-sinks/flume-irc-sink/src/test/java/org/apache/flume/sink/irc/TestIRCSink.java
new file mode 100644
index 0000000..32517d1
--- /dev/null
+++ b/code/flume-ng-sinks/flume-irc-sink/src/test/java/org/apache/flume/sink/irc/TestIRCSink.java
@@ -0,0 +1,166 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flume.sink.irc;
+
+import org.apache.commons.io.FileUtils;
+import org.apache.commons.io.IOUtils;
+import org.apache.flume.Channel;
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.flume.EventDeliveryException;
+import org.apache.flume.Sink;
+import org.apache.flume.Transaction;
+import org.apache.flume.channel.MemoryChannel;
+import org.apache.flume.conf.Configurables;
+import org.apache.flume.event.EventBuilder;
+import org.junit.After;
+import org.junit.Before;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.rules.TemporaryFolder;
+
+import java.io.File;
+import java.io.FileOutputStream;
+import java.io.IOException;
+import java.net.ServerSocket;
+import java.net.Socket;
+import java.util.List;
+import java.util.UUID;
+
+import static org.junit.Assert.fail;
+
+public class TestIRCSink {
+
+  private File eventFile;
+  int ircServerPort;
+  DumbIRCServer dumbIRCServer;
+  @Rule
+  public TemporaryFolder folder = new TemporaryFolder();
+
+  private static int findFreePort() throws IOException {
+    ServerSocket socket = new ServerSocket(0);
+    int port = socket.getLocalPort();
+    socket.close();
+    return port;
+  }
+
+  @Before
+  public void setUp() throws IOException {
+    ircServerPort = findFreePort();
+    dumbIRCServer = new DumbIRCServer(ircServerPort);
+    dumbIRCServer.start();
+    eventFile = folder.newFile("eventFile.txt");
+  }
+
+  @After
+  public void tearDown() throws Exception {
+    dumbIRCServer.shutdownServer();
+  }
+
+  @Test
+  public void testIRCSinkMissingSplitLineProperty() {
+    Sink ircSink = new IRCSink();
+    ircSink.setName("IRC Sink - " + UUID.randomUUID().toString());
+    Context context = new Context();
+    context.put("hostname", "localhost");
+    context.put("port", String.valueOf(ircServerPort));
+    context.put("nick", "flume");
+    context.put("password", "flume");
+    context.put("user", "flume");
+    context.put("name", "flume-dev");
+    context.put("chan", "flume");
+    context.put("splitchars", "false");
+    Configurables.configure(ircSink, context);
+    Channel memoryChannel = new MemoryChannel();
+    Configurables.configure(memoryChannel, context);
+    ircSink.setChannel(memoryChannel);
+    ircSink.start();
+    Transaction txn = memoryChannel.getTransaction();
+    txn.begin();
+    Event event = EventBuilder.withBody("Dummy Event".getBytes());
+    memoryChannel.put(event);
+    txn.commit();
+    txn.close();
+    try {
+      Sink.Status status = ircSink.process();
+      if (status == Sink.Status.BACKOFF) {
+        fail("Error occurred");
+      }
+    } catch (EventDeliveryException eDelExcp) {
+      // noop
+    }
+  }
+
+  class DumbIRCServer extends Thread {
+    int port;
+    ServerSocket ss;
+
+    public DumbIRCServer(int port) {
+      this.port = port;
+    }
+
+    public void run() {
+      try {
+        ss = new ServerSocket(port);
+        while (true) {
+          try {
+            Socket socket = ss.accept();
+            process(socket);
+          } catch (Exception ex) {
+            /* noop */
+          }
+        }
+      } catch (IOException e) {
+        // noop
+      }
+    }
+
+    public void shutdownServer() throws Exception {
+      ss.close();
+    }
+
+    /**
+     * Reads every line sent by the IRC client and records the PRIVMSG
+     * commands in the event file.
+     * @param socket  IRC client connection socket
+     * @throws IOException if reading the socket or writing the file fails
+     */
+    private void process(Socket socket) throws IOException {
+      FileOutputStream fileOutputStream = FileUtils.openOutputStream(eventFile);
+      List<String> input = IOUtils.readLines(socket.getInputStream());
+      for (String next : input) {
+        if (isPrivMessage(next)) {
+          fileOutputStream.write(next.getBytes());
+          fileOutputStream.write("\n".getBytes());
+        }
+      }
+      fileOutputStream.close();
+      socket.close();
+    }
+
+    /**
+     * Checks whether the received command is a PRIVMSG.
+     *
+     * @param input command received from IRC client
+     * @return true if the received command is a PRIVMSG
+     */
+    private boolean isPrivMessage(String input) {
+      return input.startsWith("PRIVMSG");
+    }
+  }
+}
\ No newline at end of file
diff --git a/code/flume-ng-sinks/flume-ng-elasticsearch-sink/pom.xml b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/pom.xml
new file mode 100644
index 0000000..527bcca
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/pom.xml
@@ -0,0 +1,93 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!-- Licensed to the Apache Software Foundation (ASF) under one or more contributor
+  license agreements. See the NOTICE file distributed with this work for additional
+  information regarding copyright ownership. The ASF licenses this file to
+  You under the Apache License, Version 2.0 (the "License"); you may not use
+  this file except in compliance with the License. You may obtain a copy of
+  the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required
+  by applicable law or agreed to in writing, software distributed under the
+  License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS
+  OF ANY KIND, either express or implied. See the License for the specific
+  language governing permissions and limitations under the License. -->
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+
+  <modelVersion>4.0.0</modelVersion>
+
+  <parent>
+    <artifactId>flume-ng-sinks</artifactId>
+    <groupId>org.apache.flume</groupId>
+    <version>1.7.0</version>
+  </parent>
+
+  <groupId>org.apache.flume.flume-ng-sinks</groupId>
+  <artifactId>flume-ng-elasticsearch-sink</artifactId>
+  <name>Flume NG ElasticSearch Sink</name>
+
+  <build>
+    <plugins>
+      <plugin>
+        <groupId>org.apache.rat</groupId>
+        <artifactId>apache-rat-plugin</artifactId>
+      </plugin>
+    </plugins>
+  </build>
+
+  <dependencies>
+
+    <dependency>
+      <groupId>org.apache.flume</groupId>
+      <artifactId>flume-ng-sdk</artifactId>
+    </dependency>
+
+    <dependency>
+      <groupId>org.apache.flume</groupId>
+      <artifactId>flume-ng-core</artifactId>
+    </dependency>
+
+    <dependency>
+      <groupId>org.slf4j</groupId>
+      <artifactId>slf4j-api</artifactId>
+    </dependency>
+
+    <dependency>
+      <groupId>org.elasticsearch</groupId>
+      <artifactId>elasticsearch</artifactId>
+      <optional>true</optional>
+    </dependency>
+
+    <dependency>
+      <groupId>org.apache.httpcomponents</groupId>
+      <artifactId>httpclient</artifactId>
+    </dependency>
+
+    <dependency>
+      <groupId>junit</groupId>
+      <artifactId>junit</artifactId>
+      <scope>test</scope>
+    </dependency>
+
+    <dependency>
+      <groupId>org.slf4j</groupId>
+      <artifactId>slf4j-log4j12</artifactId>
+      <scope>test</scope>
+    </dependency>
+
+    <dependency>
+      <groupId>commons-lang</groupId>
+      <artifactId>commons-lang</artifactId>
+    </dependency>
+
+    <dependency>
+      <groupId>com.google.guava</groupId>
+      <artifactId>guava</artifactId>
+    </dependency>
+
+    <dependency>
+      <groupId>org.mockito</groupId>
+      <artifactId>mockito-all</artifactId>
+      <scope>test</scope>
+    </dependency>
+
+  </dependencies>
+</project>
diff --git a/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/AbstractElasticSearchIndexRequestBuilderFactory.java b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/AbstractElasticSearchIndexRequestBuilderFactory.java
new file mode 100644
index 0000000..754155c
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/AbstractElasticSearchIndexRequestBuilderFactory.java
@@ -0,0 +1,124 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.flume.sink.elasticsearch;
+
+import java.io.IOException;
+
+import org.apache.commons.lang.time.FastDateFormat;
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.flume.conf.ComponentConfiguration;
+import org.apache.flume.conf.Configurable;
+import org.apache.flume.conf.ConfigurableComponent;
+import org.apache.flume.formatter.output.BucketPath;
+import org.elasticsearch.action.index.IndexRequestBuilder;
+import org.elasticsearch.client.Client;
+
+import com.google.common.annotations.VisibleForTesting;
+
+/**
+ * Abstract base class for custom implementations of
+ * {@link ElasticSearchIndexRequestBuilderFactory}.
+ */
+public abstract class AbstractElasticSearchIndexRequestBuilderFactory
+    implements ElasticSearchIndexRequestBuilderFactory {
+
+  /**
+   * {@link FastDateFormat} to use for index names
+   *   in {@link #getIndexName(String, long)}
+   */
+  protected final FastDateFormat fastDateFormat;
+
+  /**
+   * Constructor for subclasses
+   * @param fastDateFormat {@link FastDateFormat} to use for index names
+   */
+  protected AbstractElasticSearchIndexRequestBuilderFactory(FastDateFormat fastDateFormat) {
+    this.fastDateFormat = fastDateFormat;
+  }
+
+  /**
+   * @see Configurable
+   */
+  @Override
+  public abstract void configure(Context arg0);
+
+  /**
+   * @see ConfigurableComponent
+   */
+  @Override
+  public abstract void configure(ComponentConfiguration arg0);
+
+  /**
+   * Creates and prepares an {@link IndexRequestBuilder} from the supplied
+   * {@link Client} via delegation to the subclass-hook template methods
+   * {@link #getIndexName(String, long)} and
+   * {@link #prepareIndexRequest(IndexRequestBuilder, String, String, Event)}
+   */
+  @Override
+  public IndexRequestBuilder createIndexRequest(Client client,
+        String indexPrefix, String indexType, Event event) throws IOException {
+    IndexRequestBuilder request = prepareIndex(client);
+    String realIndexPrefix = BucketPath.escapeString(indexPrefix, event.getHeaders());
+    String realIndexType = BucketPath.escapeString(indexType, event.getHeaders());
+
+    TimestampedEvent timestampedEvent = new TimestampedEvent(event);
+    long timestamp = timestampedEvent.getTimestamp();
+
+    String indexName = getIndexName(realIndexPrefix, timestamp);
+    prepareIndexRequest(request, indexName, realIndexType, timestampedEvent);
+    return request;
+  }
+
+  @VisibleForTesting
+  IndexRequestBuilder prepareIndex(Client client) {
+    return client.prepareIndex();
+  }
+
+  /**
+   * Gets the name of the index to use for an index request
+   * @param indexPrefix
+   *          Prefix of index name to use -- as configured on the sink
+   * @param timestamp
+   *          timestamp (millis) to format / use
+   * @return index name of the form 'indexPrefix-formattedTimestamp'
+   */
+  protected String getIndexName(String indexPrefix, long timestamp) {
+    return new StringBuilder(indexPrefix).append('-')
+      .append(fastDateFormat.format(timestamp)).toString();
+  }
+
+  /**
+   * Prepares an ElasticSearch {@link IndexRequestBuilder} instance
+   * @param indexRequest
+   *          The (empty) ElasticSearch {@link IndexRequestBuilder} to prepare
+   * @param indexName
+   *          Index name to use -- as per {@link #getIndexName(String, long)}
+   * @param indexType
+   *          Index type to use -- as configured on the sink
+   * @param event
+   *          Flume event to serialize and add to index request
+   * @throws IOException
+   *           If an error occurs e.g. during serialization
+   */
+  protected abstract void prepareIndexRequest(
+      IndexRequestBuilder indexRequest, String indexName,
+      String indexType, Event event) throws IOException;
+
+}
\ No newline at end of file
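
The 'indexPrefix-formattedTimestamp' naming performed by getIndexName() above can be tried out without Elasticsearch on the classpath. This is a rough sketch only: it swaps commons-lang FastDateFormat for java.time so that it runs on the JDK alone, the class name is hypothetical, and the "yyyy-MM-dd" pattern in UTC is assumed to match the default format declared on ElasticSearchIndexRequestBuilderFactory.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical stand-in for the index naming logic; not part of the patch.
final class IndexNameSketch {

  // Assumption: one index per day, with the day computed in UTC.
  private static final DateTimeFormatter DAY_FORMAT =
      DateTimeFormatter.ofPattern("yyyy-MM-dd").withZone(ZoneOffset.UTC);

  private IndexNameSketch() {
  }

  // Builds "<indexPrefix>-<formatted day>" from an epoch-millis timestamp.
  static String indexName(String indexPrefix, long timestampMillis) {
    return indexPrefix + '-'
        + DAY_FORMAT.format(Instant.ofEpochMilli(timestampMillis));
  }
}
```

Formatting in a fixed zone keeps events with the same timestamp in the same daily index regardless of the time zone of the host running the agent.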
diff --git a/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/ContentBuilderUtil.java b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/ContentBuilderUtil.java
new file mode 100644
index 0000000..83c3ffd
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/ContentBuilderUtil.java
@@ -0,0 +1,87 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.flume.sink.elasticsearch;
+import static org.elasticsearch.common.xcontent.XContentFactory.jsonBuilder;
+
+import java.io.IOException;
+import java.nio.charset.Charset;
+
+import org.elasticsearch.common.jackson.core.JsonParseException;
+import org.elasticsearch.common.xcontent.XContentBuilder;
+import org.elasticsearch.common.xcontent.XContentFactory;
+import org.elasticsearch.common.xcontent.XContentParser;
+import org.elasticsearch.common.xcontent.XContentType;
+
+/**
+ * Utility methods for using ElasticSearch {@link XContentBuilder}
+ */
+public class ContentBuilderUtil {
+
+  private static final Charset charset = Charset.defaultCharset();
+
+  private ContentBuilderUtil() {
+  }
+
+  public static void appendField(XContentBuilder builder, String field,
+      byte[] data) throws IOException {
+    XContentType contentType = XContentFactory.xContentType(data);
+    if (contentType == null) {
+      addSimpleField(builder, field, data);
+    } else {
+      addComplexField(builder, field, contentType, data);
+    }
+  }
+
+  public static void addSimpleField(XContentBuilder builder, String fieldName,
+      byte[] data) throws IOException {
+    builder.field(fieldName, new String(data, charset));
+  }
+
+  public static void addComplexField(XContentBuilder builder, String fieldName,
+      XContentType contentType, byte[] data) throws IOException {
+    XContentParser parser = null;
+    try {
+      // Elasticsearch will accept JSON directly but we need to validate that
+      // the incoming event is JSON first. Sadly, the elasticsearch JSON parser
+      // is a stream parser so we need to instantiate it, parse the event to
+      // validate it, then instantiate it again to provide the JSON to
+      // elasticsearch.
+      // If validation fails then the incoming event is submitted to
+      // elasticsearch as plain text.
+      parser = XContentFactory.xContent(contentType).createParser(data);
+      while (parser.nextToken() != null) {}
+
+      // If the JSON is valid then include it
+      parser = XContentFactory.xContent(contentType).createParser(data);
+      // Add the field name, but not the value.
+      builder.field(fieldName);
+      // This will add the whole parsed content as the value of the field.
+      builder.copyCurrentStructure(parser);
+    } catch (JsonParseException ex) {
+      // If we get an exception here the most likely cause is nested JSON that
+      // can't be figured out in the body. At this point just push it through
+      // as is
+      addSimpleField(builder, fieldName, data);
+    } finally {
+      if (parser != null) {
+        parser.close();
+      }
+    }
+  }
+}
diff --git a/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/ElasticSearchDynamicSerializer.java b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/ElasticSearchDynamicSerializer.java
new file mode 100644
index 0000000..aa7ad39
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/ElasticSearchDynamicSerializer.java
@@ -0,0 +1,73 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.flume.sink.elasticsearch;
+
+import static org.elasticsearch.common.xcontent.XContentFactory.jsonBuilder;
+
+import java.io.IOException;
+import java.util.Map;
+
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.flume.conf.ComponentConfiguration;
+import org.elasticsearch.common.xcontent.XContentBuilder;
+
+/**
+ * Basic serializer that serializes the event body and header fields into
+ * individual fields.
+ * <p>
+ * A best effort is made to determine the content type; if it cannot be
+ * determined, fields are indexed as Strings.
+ */
+public class ElasticSearchDynamicSerializer implements
+    ElasticSearchEventSerializer {
+
+  @Override
+  public void configure(Context context) {
+    // NO-OP...
+  }
+
+  @Override
+  public void configure(ComponentConfiguration conf) {
+    // NO-OP...
+  }
+
+  @Override
+  public XContentBuilder getContentBuilder(Event event) throws IOException {
+    XContentBuilder builder = jsonBuilder().startObject();
+    appendBody(builder, event);
+    appendHeaders(builder, event);
+    return builder;
+  }
+
+  private void appendBody(XContentBuilder builder, Event event)
+      throws IOException {
+    ContentBuilderUtil.appendField(builder, "body", event.getBody());
+  }
+
+  private void appendHeaders(XContentBuilder builder, Event event)
+      throws IOException {
+    Map<String, String> headers = event.getHeaders();
+    for (String key : headers.keySet()) {
+      ContentBuilderUtil.appendField(builder, key,
+          headers.get(key).getBytes(charset));
+    }
+  }
+
+}
diff --git a/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/ElasticSearchEventSerializer.java b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/ElasticSearchEventSerializer.java
new file mode 100644
index 0000000..c89d627
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/ElasticSearchEventSerializer.java
@@ -0,0 +1,48 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.flume.sink.elasticsearch;
+
+import java.io.IOException;
+import java.nio.charset.Charset;
+
+import org.apache.flume.Event;
+import org.apache.flume.conf.Configurable;
+import org.apache.flume.conf.ConfigurableComponent;
+import org.elasticsearch.common.io.BytesStream;
+
+/**
+ * Interface for an event serializer which serializes the headers and body of an
+ * event to write them to ElasticSearch. This is configurable, so any config
+ * params required should be taken through this.
+ */
+public interface ElasticSearchEventSerializer extends Configurable,
+    ConfigurableComponent {
+
+  public static final Charset charset = Charset.defaultCharset();
+
+  /**
+   * Return a {@link BytesStream} made up of the serialized flume event
+   * @param event
+   *          The flume event to serialize
+   * @return A {@link BytesStream} used to write to ElasticSearch
+   * @throws IOException
+   *           If an error occurs during serialization
+   */
+  abstract BytesStream getContentBuilder(Event event) throws IOException;
+}
diff --git a/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/ElasticSearchIndexRequestBuilderFactory.java b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/ElasticSearchIndexRequestBuilderFactory.java
new file mode 100644
index 0000000..f76308c
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/ElasticSearchIndexRequestBuilderFactory.java
@@ -0,0 +1,60 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.flume.sink.elasticsearch;
+
+import org.apache.commons.lang.time.FastDateFormat;
+import org.apache.flume.Event;
+import org.apache.flume.conf.Configurable;
+import org.apache.flume.conf.ConfigurableComponent;
+import org.elasticsearch.action.index.IndexRequestBuilder;
+import org.elasticsearch.client.Client;
+
+import java.io.IOException;
+import java.util.TimeZone;
+
+/**
+ * Interface for creating ElasticSearch {@link IndexRequestBuilder} instances
+ * from serialized flume events. This is configurable, so any config params
+ * required should be taken through this.
+ */
+public interface ElasticSearchIndexRequestBuilderFactory extends Configurable,
+    ConfigurableComponent {
+
+  static final FastDateFormat df = FastDateFormat.getInstance("yyyy-MM-dd",
+      TimeZone.getTimeZone("Etc/UTC"));
+
+  /**
+   * @param client
+   *          ElasticSearch {@link Client} to prepare index from
+   * @param indexPrefix
+   *          Prefix of index name to use -- as configured on the sink
+   * @param indexType
+   *          Index type to use -- as configured on the sink
+   * @param event
+   *          Flume event to serialize and add to index request
+   * @return prepared ElasticSearch {@link IndexRequestBuilder} instance
+   * @throws IOException
+   *           If an error occurs e.g. during serialization
+   */
+  IndexRequestBuilder createIndexRequest(Client client, String indexPrefix,
+      String indexType, Event event) throws IOException;
+
+
+
+}
diff --git a/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/ElasticSearchLogStashEventSerializer.java b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/ElasticSearchLogStashEventSerializer.java
new file mode 100644
index 0000000..3638368
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/ElasticSearchLogStashEventSerializer.java
@@ -0,0 +1,145 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.flume.sink.elasticsearch;
+
+import static org.elasticsearch.common.xcontent.XContentFactory.jsonBuilder;
+
+import java.io.IOException;
+import java.io.UnsupportedEncodingException;
+import java.util.Date;
+import java.util.Map;
+
+import org.apache.commons.lang.StringUtils;
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.flume.conf.ComponentConfiguration;
+import org.elasticsearch.common.collect.Maps;
+import org.elasticsearch.common.xcontent.XContentBuilder;
+
+/**
+ * Serialize flume events into the same format LogStash uses.
+ * <p>
+ * This can be used to send events to ElasticSearch and use clients such as
+ * Kibana which expect Logstash-formatted indexes.
+ *
+ * <pre>
+ * {
+ *    "@timestamp": "2010-12-21T21:48:33.309258Z",
+ *    "@tags": [ "array", "of", "tags" ],
+ *    "@type": "string",
+ *    "@source": "source of the event, usually a URL.",
+ *    "@source_host": "",
+ *    "@source_path": "",
+ *    "@fields": {
+ *       # a set of fields for this event
+ *       "user": "jordan",
+ *       "command": "shutdown -r"
+ *    },
+ *    "@message": "the original plain-text message"
+ * }
+ * </pre>
+ *
+ * If the following headers are present, they will map to the above logstash
+ * output as long as the logstash fields are not already present.
+ *
+ * <pre>
+ *  timestamp: long -> @timestamp:Date
+ *  host: String -> @source_host: String
+ *  src_path: String -> @source_path: String
+ *  type: String -> @type: String
+ *  source: String -> @source: String
+ * </pre>
+ *
+ * @see <a href=
+ *      "https://github.com/logstash/logstash/wiki/logstash%27s-internal-message-format">
+ *      logstash's internal message format</a>
+ */
+public class ElasticSearchLogStashEventSerializer implements
+    ElasticSearchEventSerializer {
+
+  @Override
+  public XContentBuilder getContentBuilder(Event event) throws IOException {
+    XContentBuilder builder = jsonBuilder().startObject();
+    appendBody(builder, event);
+    appendHeaders(builder, event);
+    return builder;
+  }
+
+  private void appendBody(XContentBuilder builder, Event event)
+      throws IOException, UnsupportedEncodingException {
+    byte[] body = event.getBody();
+    ContentBuilderUtil.appendField(builder, "@message", body);
+  }
+
+  private void appendHeaders(XContentBuilder builder, Event event)
+      throws IOException {
+    Map<String, String> headers = Maps.newHashMap(event.getHeaders());
+
+    String timestamp = headers.get("timestamp");
+    if (!StringUtils.isBlank(timestamp)
+        && StringUtils.isBlank(headers.get("@timestamp"))) {
+      long timestampMs = Long.parseLong(timestamp);
+      builder.field("@timestamp", new Date(timestampMs));
+    }
+
+    String source = headers.get("source");
+    if (!StringUtils.isBlank(source)
+        && StringUtils.isBlank(headers.get("@source"))) {
+      ContentBuilderUtil.appendField(builder, "@source",
+          source.getBytes(charset));
+    }
+
+    String type = headers.get("type");
+    if (!StringUtils.isBlank(type)
+        && StringUtils.isBlank(headers.get("@type"))) {
+      ContentBuilderUtil.appendField(builder, "@type", type.getBytes(charset));
+    }
+
+    String host = headers.get("host");
+    if (!StringUtils.isBlank(host)
+        && StringUtils.isBlank(headers.get("@source_host"))) {
+      ContentBuilderUtil.appendField(builder, "@source_host",
+          host.getBytes(charset));
+    }
+
+    String srcPath = headers.get("src_path");
+    if (!StringUtils.isBlank(srcPath)
+        && StringUtils.isBlank(headers.get("@source_path"))) {
+      ContentBuilderUtil.appendField(builder, "@source_path",
+          srcPath.getBytes(charset));
+    }
+
+    builder.startObject("@fields");
+    for (String key : headers.keySet()) {
+      byte[] val = headers.get(key).getBytes(charset);
+      ContentBuilderUtil.appendField(builder, key, val);
+    }
+    builder.endObject();
+  }
+
+  @Override
+  public void configure(Context context) {
+    // NO-OP...
+  }
+
+  @Override
+  public void configure(ComponentConfiguration conf) {
+    // NO-OP...
+  }
+}
diff --git a/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/ElasticSearchSink.java b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/ElasticSearchSink.java
new file mode 100644
index 0000000..ebafb9f
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/ElasticSearchSink.java
@@ -0,0 +1,428 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.flume.sink.elasticsearch;
+
+import static org.apache.flume.sink.elasticsearch.ElasticSearchSinkConstants.BATCH_SIZE;
+import static org.apache.flume.sink.elasticsearch.ElasticSearchSinkConstants.CLUSTER_NAME;
+import static org.apache.flume.sink.elasticsearch.ElasticSearchSinkConstants.DEFAULT_CLUSTER_NAME;
+import static org.apache.flume.sink.elasticsearch.ElasticSearchSinkConstants.DEFAULT_INDEX_NAME;
+import static org.apache.flume.sink.elasticsearch.ElasticSearchSinkConstants.DEFAULT_INDEX_TYPE;
+import static org.apache.flume.sink.elasticsearch.ElasticSearchSinkConstants.DEFAULT_TTL;
+import static org.apache.flume.sink.elasticsearch.ElasticSearchSinkConstants.HOSTNAMES;
+import static org.apache.flume.sink.elasticsearch.ElasticSearchSinkConstants.INDEX_NAME;
+import static org.apache.flume.sink.elasticsearch.ElasticSearchSinkConstants.INDEX_TYPE;
+import static org.apache.flume.sink.elasticsearch.ElasticSearchSinkConstants.SERIALIZER;
+import static org.apache.flume.sink.elasticsearch.ElasticSearchSinkConstants.SERIALIZER_PREFIX;
+import static org.apache.flume.sink.elasticsearch.ElasticSearchSinkConstants.TTL;
+import static org.apache.flume.sink.elasticsearch.ElasticSearchSinkConstants.TTL_REGEX;
+import org.apache.commons.lang.StringUtils;
+import org.apache.flume.Channel;
+import org.apache.flume.Context;
+import org.apache.flume.CounterGroup;
+import org.apache.flume.Event;
+import org.apache.flume.EventDeliveryException;
+import org.apache.flume.Transaction;
+import org.apache.flume.formatter.output.BucketPath;
+import org.apache.flume.conf.Configurable;
+import org.apache.flume.instrumentation.SinkCounter;
+import org.apache.flume.sink.AbstractSink;
+import org.apache.flume.sink.elasticsearch.client.ElasticSearchClient;
+import org.apache.flume.sink.elasticsearch.client.ElasticSearchClientFactory;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import com.google.common.annotations.VisibleForTesting;
+import com.google.common.base.Preconditions;
+import com.google.common.base.Throwables;
+
+import java.util.concurrent.TimeUnit;
+import java.util.regex.Matcher;
+import java.util.regex.Pattern;
+
+import static org.apache.flume.sink.elasticsearch.ElasticSearchSinkConstants.CLIENT_PREFIX;
+import static org.apache.flume.sink.elasticsearch.ElasticSearchSinkConstants.CLIENT_TYPE;
+import static org.apache.flume.sink.elasticsearch.ElasticSearchSinkConstants.DEFAULT_CLIENT_TYPE;
+import static org.apache.flume.sink.elasticsearch.ElasticSearchSinkConstants.DEFAULT_INDEX_NAME_BUILDER_CLASS;
+import static org.apache.flume.sink.elasticsearch.ElasticSearchSinkConstants.DEFAULT_SERIALIZER_CLASS;
+import static org.apache.flume.sink.elasticsearch.ElasticSearchSinkConstants.INDEX_NAME_BUILDER;
+import static org.apache.flume.sink.elasticsearch.ElasticSearchSinkConstants.INDEX_NAME_BUILDER_PREFIX;
+
+/**
+ * A sink which reads events from a channel and writes them to ElasticSearch,
+ * based on the work done by https://github.com/Aconex/elasticflume.git.
+ * <p>
+ * This sink supports batch reading of events from the channel and writing them
+ * to ElasticSearch.
+ * <p>
+ * Indexes are rolled daily using the format 'indexname-yyyy-MM-dd' to allow
+ * easier management of the index.
+ * <p>
+ * This sink must be configured with the mandatory parameters detailed in
+ * {@link ElasticSearchSinkConstants}. As a secondary step, it is recommended
+ * that the ElasticSearch indexes be optimized for the specified serializer.
+ * This is not handled by the sink but is typically done by deploying a config
+ * template alongside the ElasticSearch deployment.
+ *
+ * @see <a href=
+ *      "http://www.elasticsearch.org/guide/reference/api/admin-indices-templates.html">
+ *      index templates reference</a>
+ */
+public class ElasticSearchSink extends AbstractSink implements Configurable {
+
+  private static final Logger logger = LoggerFactory
+      .getLogger(ElasticSearchSink.class);
+
+  // Used for testing
+  private boolean isLocal = false;
+  private final CounterGroup counterGroup = new CounterGroup();
+
+  private static final int defaultBatchSize = 100;
+
+  private int batchSize = defaultBatchSize;
+  private long ttlMs = DEFAULT_TTL;
+  private String clusterName = DEFAULT_CLUSTER_NAME;
+  private String indexName = DEFAULT_INDEX_NAME;
+  private String indexType = DEFAULT_INDEX_TYPE;
+  private String clientType = DEFAULT_CLIENT_TYPE;
+  private final Pattern pattern = Pattern.compile(TTL_REGEX,
+      Pattern.CASE_INSENSITIVE);
+  private Matcher matcher = pattern.matcher("");
+
+  private String[] serverAddresses = null;
+
+  private ElasticSearchClient client = null;
+  private Context elasticSearchClientContext = null;
+
+  private ElasticSearchIndexRequestBuilderFactory indexRequestFactory;
+  private ElasticSearchEventSerializer eventSerializer;
+  private IndexNameBuilder indexNameBuilder;
+  private SinkCounter sinkCounter;
+
+  /**
+   * Create an {@link ElasticSearchSink} configured using the supplied
+   * configuration
+   */
+  public ElasticSearchSink() {
+    this(false);
+  }
+
+  /**
+   * Create an {@link ElasticSearchSink}.
+   *
+   * @param isLocal
+   *          If <tt>true</tt> sink will be configured to only talk to an
+   *          ElasticSearch instance hosted in the same JVM, should always be
+   *          false in production
+   * 
+   */
+  @VisibleForTesting
+  ElasticSearchSink(boolean isLocal) {
+    this.isLocal = isLocal;
+  }
+
+  @VisibleForTesting
+  String[] getServerAddresses() {
+    return serverAddresses;
+  }
+
+  @VisibleForTesting
+  String getClusterName() {
+    return clusterName;
+  }
+
+  @VisibleForTesting
+  String getIndexName() {
+    return indexName;
+  }
+
+  @VisibleForTesting
+  String getIndexType() {
+    return indexType;
+  }
+
+  @VisibleForTesting
+  long getTTLMs() {
+    return ttlMs;
+  }
+
+  @VisibleForTesting
+  ElasticSearchEventSerializer getEventSerializer() {
+    return eventSerializer;
+  }
+
+  @VisibleForTesting
+  IndexNameBuilder getIndexNameBuilder() {
+    return indexNameBuilder;
+  }
+
+  @Override
+  public Status process() throws EventDeliveryException {
+    logger.debug("processing...");
+    Status status = Status.READY;
+    Channel channel = getChannel();
+    Transaction txn = channel.getTransaction();
+    try {
+      txn.begin();
+      int count;
+      for (count = 0; count < batchSize; ++count) {
+        Event event = channel.take();
+
+        if (event == null) {
+          break;
+        }
+        String realIndexType = BucketPath.escapeString(indexType, event.getHeaders());
+        client.addEvent(event, indexNameBuilder, realIndexType, ttlMs);
+      }
+
+      if (count <= 0) {
+        sinkCounter.incrementBatchEmptyCount();
+        counterGroup.incrementAndGet("channel.underflow");
+        status = Status.BACKOFF;
+      } else {
+        if (count < batchSize) {
+          sinkCounter.incrementBatchUnderflowCount();
+          status = Status.BACKOFF;
+        } else {
+          sinkCounter.incrementBatchCompleteCount();
+        }
+
+        sinkCounter.addToEventDrainAttemptCount(count);
+        client.execute();
+      }
+      txn.commit();
+      sinkCounter.addToEventDrainSuccessCount(count);
+      counterGroup.incrementAndGet("transaction.success");
+    } catch (Throwable ex) {
+      try {
+        txn.rollback();
+        counterGroup.incrementAndGet("transaction.rollback");
+      } catch (Exception ex2) {
+        logger.error(
+            "Exception in rollback. Rollback might not have been successful.",
+            ex2);
+      }
+
+      if (ex instanceof Error || ex instanceof RuntimeException) {
+        logger.error("Failed to commit transaction. Transaction rolled back.",
+            ex);
+        Throwables.propagate(ex);
+      } else {
+        logger.error("Failed to commit transaction. Transaction rolled back.",
+            ex);
+        throw new EventDeliveryException(
+            "Failed to commit transaction. Transaction rolled back.", ex);
+      }
+    } finally {
+      txn.close();
+    }
+    return status;
+  }
+
+  @Override
+  public void configure(Context context) {
+    if (!isLocal) {
+      if (StringUtils.isNotBlank(context.getString(HOSTNAMES))) {
+        serverAddresses = StringUtils.deleteWhitespace(
+            context.getString(HOSTNAMES)).split(",");
+      }
+      Preconditions.checkState(serverAddresses != null
+          && serverAddresses.length > 0, "Missing Param:" + HOSTNAMES);
+    }
+
+    if (StringUtils.isNotBlank(context.getString(INDEX_NAME))) {
+      this.indexName = context.getString(INDEX_NAME);
+    }
+
+    if (StringUtils.isNotBlank(context.getString(INDEX_TYPE))) {
+      this.indexType = context.getString(INDEX_TYPE);
+    }
+
+    if (StringUtils.isNotBlank(context.getString(CLUSTER_NAME))) {
+      this.clusterName = context.getString(CLUSTER_NAME);
+    }
+
+    if (StringUtils.isNotBlank(context.getString(BATCH_SIZE))) {
+      this.batchSize = Integer.parseInt(context.getString(BATCH_SIZE));
+    }
+
+    if (StringUtils.isNotBlank(context.getString(TTL))) {
+      this.ttlMs = parseTTL(context.getString(TTL));
+      Preconditions.checkState(ttlMs > 0, TTL
+          + " must be greater than 0 or not set.");
+    }
+
+    if (StringUtils.isNotBlank(context.getString(CLIENT_TYPE))) {
+      clientType = context.getString(CLIENT_TYPE);
+    }
+
+    elasticSearchClientContext = new Context();
+    elasticSearchClientContext.putAll(context.getSubProperties(CLIENT_PREFIX));
+
+    String serializerClazz = DEFAULT_SERIALIZER_CLASS;
+    if (StringUtils.isNotBlank(context.getString(SERIALIZER))) {
+      serializerClazz = context.getString(SERIALIZER);
+    }
+
+    Context serializerContext = new Context();
+    serializerContext.putAll(context.getSubProperties(SERIALIZER_PREFIX));
+
+    try {
+      @SuppressWarnings("unchecked")
+      Class<? extends Configurable> clazz = (Class<? extends Configurable>) Class
+          .forName(serializerClazz);
+      Configurable serializer = clazz.newInstance();
+
+      if (serializer instanceof ElasticSearchIndexRequestBuilderFactory) {
+        indexRequestFactory
+            = (ElasticSearchIndexRequestBuilderFactory) serializer;
+        indexRequestFactory.configure(serializerContext);
+      } else if (serializer instanceof ElasticSearchEventSerializer) {
+        eventSerializer = (ElasticSearchEventSerializer) serializer;
+        eventSerializer.configure(serializerContext);
+      } else {
+        throw new IllegalArgumentException(serializerClazz
+            + " is not an ElasticSearchEventSerializer");
+      }
+    } catch (Exception e) {
+      logger.error("Could not instantiate event serializer.", e);
+      Throwables.propagate(e);
+    }
+
+    if (sinkCounter == null) {
+      sinkCounter = new SinkCounter(getName());
+    }
+
+    String indexNameBuilderClass = DEFAULT_INDEX_NAME_BUILDER_CLASS;
+    if (StringUtils.isNotBlank(context.getString(INDEX_NAME_BUILDER))) {
+      indexNameBuilderClass = context.getString(INDEX_NAME_BUILDER);
+    }
+
+    Context indexnameBuilderContext = new Context();
+    indexnameBuilderContext.putAll(
+            context.getSubProperties(INDEX_NAME_BUILDER_PREFIX));
+
+    try {
+      @SuppressWarnings("unchecked")
+      Class<? extends IndexNameBuilder> clazz
+              = (Class<? extends IndexNameBuilder>) Class
+              .forName(indexNameBuilderClass);
+      indexNameBuilder = clazz.newInstance();
+      indexnameBuilderContext.put(INDEX_NAME, indexName);
+      indexNameBuilder.configure(indexnameBuilderContext);
+    } catch (Exception e) {
+      logger.error("Could not instantiate index name builder.", e);
+      Throwables.propagate(e);
+    }
+
+    if (sinkCounter == null) {
+      sinkCounter = new SinkCounter(getName());
+    }
+
+    Preconditions.checkState(StringUtils.isNotBlank(indexName),
+        "Missing Param:" + INDEX_NAME);
+    Preconditions.checkState(StringUtils.isNotBlank(indexType),
+        "Missing Param:" + INDEX_TYPE);
+    Preconditions.checkState(StringUtils.isNotBlank(clusterName),
+        "Missing Param:" + CLUSTER_NAME);
+    Preconditions.checkState(batchSize >= 1, BATCH_SIZE
+        + " must be greater than 0");
+  }
+
+  @Override
+  public void start() {
+    ElasticSearchClientFactory clientFactory = new ElasticSearchClientFactory();
+
+    logger.info("ElasticSearch sink {} started", getName());
+    sinkCounter.start();
+    try {
+      if (isLocal) {
+        client = clientFactory.getLocalClient(
+            clientType, eventSerializer, indexRequestFactory);
+      } else {
+        client = clientFactory.getClient(clientType, serverAddresses,
+            clusterName, eventSerializer, indexRequestFactory);
+        client.configure(elasticSearchClientContext);
+      }
+      sinkCounter.incrementConnectionCreatedCount();
+    } catch (Exception ex) {
+      logger.error("Failed to create ElasticSearch client.", ex);
+      sinkCounter.incrementConnectionFailedCount();
+      if (client != null) {
+        client.close();
+        sinkCounter.incrementConnectionClosedCount();
+      }
+    }
+
+    super.start();
+  }
+
+  @Override
+  public void stop() {
+    logger.info("ElasticSearch sink {} stopping", getName());
+    if (client != null) {
+      client.close();
+    }
+    sinkCounter.incrementConnectionClosedCount();
+    sinkCounter.stop();
+    super.stop();
+  }
+
+  /**
+   * Returns the TTL value for the ElasticSearch index in milliseconds when the
+   * TTL specifier is "ms" / "s" / "m" / "h" / "d" / "w". For an unknown
+   * specifier the TTL is not set and 0 is returned. When no specifier is given,
+   * the parsed integer is treated as a number of days and converted to
+   * milliseconds. <p> Elasticsearch itself accepts ttl values in the format
+   * 1d / 1w / 1ms / 1s / 1h / 1m, with a time unit such as d (days), m
+   * (minutes), h (hours), ms (milliseconds) or w (weeks), and uses
+   * milliseconds as its default unit. See
+   * http://www.elasticsearch.org/guide/reference/mapping/ttl-field/.
+   * 
+   * @param ttl TTL value provided by user in flume configuration file for the
+   * sink
+   * 
+   * @return the ttl value in milliseconds
+   */
+  private long parseTTL(String ttl) {
+    matcher = matcher.reset(ttl);
+    while (matcher.find()) {
+      if (matcher.group(2).equals("ms")) {
+        return Long.parseLong(matcher.group(1));
+      } else if (matcher.group(2).equals("s")) {
+        return TimeUnit.SECONDS.toMillis(Integer.parseInt(matcher.group(1)));
+      } else if (matcher.group(2).equals("m")) {
+        return TimeUnit.MINUTES.toMillis(Integer.parseInt(matcher.group(1)));
+      } else if (matcher.group(2).equals("h")) {
+        return TimeUnit.HOURS.toMillis(Integer.parseInt(matcher.group(1)));
+      } else if (matcher.group(2).equals("d")) {
+        return TimeUnit.DAYS.toMillis(Integer.parseInt(matcher.group(1)));
+      } else if (matcher.group(2).equals("w")) {
+        return TimeUnit.DAYS.toMillis(7 * Integer.parseInt(matcher.group(1)));
+      } else if (matcher.group(2).equals("")) {
+        logger.info("TTL qualifier is empty. Defaulting to day qualifier.");
+        return TimeUnit.DAYS.toMillis(Integer.parseInt(matcher.group(1)));
+      } else {
+        logger.debug("Unknown TTL qualifier provided. Setting TTL to 0.");
+        return 0;
+      }
+    }
+    logger.info("TTL not provided. Skipping the TTL config by returning 0.");
+    return 0;
+  }
+}
diff --git a/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/ElasticSearchSinkConstants.java b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/ElasticSearchSinkConstants.java
new file mode 100644
index 0000000..da88def
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/ElasticSearchSinkConstants.java
@@ -0,0 +1,111 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.flume.sink.elasticsearch;
+
+public class ElasticSearchSinkConstants {
+
+  /**
+   * Comma-separated list of hostname:port; if the port is not present, the
+   * default port '9300' will be used.
+   * Example:
+   * <pre>
+   *  127.0.0.1:9200,127.0.0.2:9300
+   * </pre>
+   */
+  public static final String HOSTNAMES = "hostNames";
+
+  /**
+   * The name to index the document to, defaults to 'flume'.
+   * The current date in the format 'yyyy-MM-dd' will be appended to this name,
+   * for example 'foo' will result in a daily index of 'foo-yyyy-MM-dd'
+   */
+  public static final String INDEX_NAME = "indexName";
+
+  /**
+   * The type to index the document to, defaults to 'log'
+   */
+  public static final String INDEX_TYPE = "indexType";
+
+  /**
+   * Name of the ElasticSearch cluster to connect to
+   */
+  public static final String CLUSTER_NAME = "clusterName";
+
+  /**
+   * Maximum number of events the sink should take from the channel per
+   * transaction, if available. Defaults to 100
+   */
+  public static final String BATCH_SIZE = "batchSize";
+
+  /**
+   * TTL, in days unless a unit suffix (ms, s, m, h, d, w) is given. When set,
+   * expired documents are deleted automatically; otherwise they never are.
+   */
+  public static final String TTL = "ttl";
+
+  /**
+   * The fully qualified class name of the serializer the sink should use.
+   */
+  public static final String SERIALIZER = "serializer";
+
+  /**
+   * Configuration to pass to the serializer.
+   */
+  public static final String SERIALIZER_PREFIX = SERIALIZER + ".";
+
+  /**
+   * The fully qualified class name of the index name builder the sink
+   * should use to determine name of index where the event should be sent.
+   */
+  public static final String INDEX_NAME_BUILDER = "indexNameBuilder";
+
+  /**
+   * The prefix used to extract the configuration that will be passed to the
+   * index name builder.
+   */
+  public static final String INDEX_NAME_BUILDER_PREFIX
+          = INDEX_NAME_BUILDER + ".";
+
+  /**
+   * The client type used for sending bulks to ElasticSearch
+   */
+  public static final String CLIENT_TYPE = "client";
+
+  /**
+   * The client prefix to extract the configuration that will be passed to
+   * elasticsearch client.
+   */
+  public static final String CLIENT_PREFIX = CLIENT_TYPE + ".";
+
+  /**
+   * DEFAULTS USED BY THE SINK
+   */
+
+  public static final int DEFAULT_PORT = 9300;
+  public static final int DEFAULT_TTL = -1;
+  public static final String DEFAULT_INDEX_NAME = "flume";
+  public static final String DEFAULT_INDEX_TYPE = "log";
+  public static final String DEFAULT_CLUSTER_NAME = "elasticsearch";
+  public static final String DEFAULT_CLIENT_TYPE = "transport";
+  public static final String TTL_REGEX = "^(\\d+)(\\D*)";
+  public static final String DEFAULT_SERIALIZER_CLASS = "org.apache.flume." +
+          "sink.elasticsearch.ElasticSearchLogStashEventSerializer";
+  public static final String DEFAULT_INDEX_NAME_BUILDER_CLASS =
+          "org.apache.flume.sink.elasticsearch.TimeBasedIndexNameBuilder";
+}
diff --git a/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/EventSerializerIndexRequestBuilderFactory.java b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/EventSerializerIndexRequestBuilderFactory.java
new file mode 100644
index 0000000..d6cca50
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/EventSerializerIndexRequestBuilderFactory.java
@@ -0,0 +1,69 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.flume.sink.elasticsearch;
+
+import java.io.IOException;
+
+import org.apache.commons.lang.time.FastDateFormat;
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.flume.conf.ComponentConfiguration;
+import org.elasticsearch.action.index.IndexRequestBuilder;
+import org.elasticsearch.common.io.BytesStream;
+
+/**
+ * Default implementation of {@link ElasticSearchIndexRequestBuilderFactory}.
+ * It serializes flume events using the
+ * {@link ElasticSearchEventSerializer} instance configured on the sink.
+ */
+public class EventSerializerIndexRequestBuilderFactory
+    extends AbstractElasticSearchIndexRequestBuilderFactory {
+
+  protected final ElasticSearchEventSerializer serializer;
+
+  public EventSerializerIndexRequestBuilderFactory(
+      ElasticSearchEventSerializer serializer) {
+    this(serializer, ElasticSearchIndexRequestBuilderFactory.df);
+  }
+
+  protected EventSerializerIndexRequestBuilderFactory(
+      ElasticSearchEventSerializer serializer, FastDateFormat fdf) {
+    super(fdf);
+    this.serializer = serializer;
+  }
+
+  @Override
+  public void configure(Context context) {
+    serializer.configure(context);
+  }
+
+  @Override
+  public void configure(ComponentConfiguration config) {
+    serializer.configure(config);
+  }
+
+  @Override
+  protected void prepareIndexRequest(IndexRequestBuilder indexRequest,
+      String indexName, String indexType, Event event) throws IOException {
+    BytesStream contentBuilder = serializer.getContentBuilder(event);
+    indexRequest.setIndex(indexName)
+        .setType(indexType)
+        .setSource(contentBuilder.bytes());
+  }
+}
diff --git a/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/IndexNameBuilder.java b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/IndexNameBuilder.java
new file mode 100644
index 0000000..1dd4415
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/IndexNameBuilder.java
@@ -0,0 +1,42 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.flume.sink.elasticsearch;
+
+import org.apache.flume.Event;
+import org.apache.flume.conf.Configurable;
+import org.apache.flume.conf.ConfigurableComponent;
+
+public interface IndexNameBuilder extends Configurable,
+        ConfigurableComponent {
+  /**
+   * Gets the name of the index to use for an index request
+   * @param event
+   *          Event which determines index name
+   * @return index name of the form 'indexPrefix-indexDynamicName'
+   */
+  public String getIndexName(Event event);
+  
+  /**
+   * Gets the prefix of index to use for an index request.
+   * @param event
+   *          Event which determines index name
+   * @return Index prefix name
+   */
+  public String getIndexPrefix(Event event);
+}
diff --git a/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/SimpleIndexNameBuilder.java b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/SimpleIndexNameBuilder.java
new file mode 100644
index 0000000..801cac9
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/SimpleIndexNameBuilder.java
@@ -0,0 +1,46 @@
+/*
+ * Copyright 2014 Apache Software Foundation.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flume.sink.elasticsearch;
+
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.flume.conf.ComponentConfiguration;
+import org.apache.flume.formatter.output.BucketPath;
+
+public class SimpleIndexNameBuilder implements IndexNameBuilder {
+
+  private String indexName;
+
+  @Override
+  public String getIndexName(Event event) {
+    return BucketPath.escapeString(indexName, event.getHeaders());
+  }
+
+  @Override
+  public String getIndexPrefix(Event event) {
+    return BucketPath.escapeString(indexName, event.getHeaders());
+  }
+
+  @Override
+  public void configure(Context context) {
+    indexName = context.getString(ElasticSearchSinkConstants.INDEX_NAME);
+  }
+
+  @Override
+  public void configure(ComponentConfiguration conf) {
+  }
+}
diff --git a/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/TimeBasedIndexNameBuilder.java b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/TimeBasedIndexNameBuilder.java
new file mode 100644
index 0000000..c651732
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/TimeBasedIndexNameBuilder.java
@@ -0,0 +1,92 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.flume.sink.elasticsearch;
+
+import com.google.common.annotations.VisibleForTesting;
+import org.apache.commons.lang.StringUtils;
+import org.apache.commons.lang.time.FastDateFormat;
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.flume.conf.ComponentConfiguration;
+import org.apache.flume.formatter.output.BucketPath;
+
+import java.util.TimeZone;
+
+/**
+ * Default index name builder. It prepares name of index using configured
+ * prefix and current timestamp. Default format of name is "prefix-yyyy-MM-dd".
+ */
+public class TimeBasedIndexNameBuilder implements
+        IndexNameBuilder {
+
+  public static final String DATE_FORMAT = "dateFormat";
+  public static final String TIME_ZONE = "timeZone";
+
+  public static final String DEFAULT_DATE_FORMAT = "yyyy-MM-dd";
+  public static final String DEFAULT_TIME_ZONE = "Etc/UTC";
+
+  private FastDateFormat fastDateFormat = FastDateFormat.getInstance(DEFAULT_DATE_FORMAT,
+      TimeZone.getTimeZone(DEFAULT_TIME_ZONE));
+
+  private String indexPrefix;
+
+  @VisibleForTesting
+  FastDateFormat getFastDateFormat() {
+    return fastDateFormat;
+  }
+
+  /**
+   * Gets the name of the index to use for an index request
+   * @param event
+   *          Event for which the name of index has to be prepared
+   * @return index name of the form 'indexPrefix-formattedTimestamp'
+   */
+  @Override
+  public String getIndexName(Event event) {
+    TimestampedEvent timestampedEvent = new TimestampedEvent(event);
+    long timestamp = timestampedEvent.getTimestamp();
+    String realIndexPrefix = BucketPath.escapeString(indexPrefix, event.getHeaders());
+    return new StringBuilder(realIndexPrefix).append('-')
+      .append(fastDateFormat.format(timestamp)).toString();
+  }
+  
+  @Override
+  public String getIndexPrefix(Event event) {
+    return BucketPath.escapeString(indexPrefix, event.getHeaders());
+  }
+
+  @Override
+  public void configure(Context context) {
+    String dateFormatString = context.getString(DATE_FORMAT);
+    String timeZoneString = context.getString(TIME_ZONE);
+    if (StringUtils.isBlank(dateFormatString)) {
+      dateFormatString = DEFAULT_DATE_FORMAT;
+    }
+    if (StringUtils.isBlank(timeZoneString)) {
+      timeZoneString = DEFAULT_TIME_ZONE;
+    }
+    fastDateFormat = FastDateFormat.getInstance(dateFormatString,
+        TimeZone.getTimeZone(timeZoneString));
+    indexPrefix = context.getString(ElasticSearchSinkConstants.INDEX_NAME);
+  }
+
+  @Override
+  public void configure(ComponentConfiguration conf) {
+  }
+}
diff --git a/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/TimestampedEvent.java b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/TimestampedEvent.java
new file mode 100644
index 0000000..c056839
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/TimestampedEvent.java
@@ -0,0 +1,60 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.flume.sink.elasticsearch;
+
+import com.google.common.collect.Maps;
+import org.apache.commons.lang.StringUtils;
+import org.apache.flume.Event;
+import org.apache.flume.event.SimpleEvent;
+import org.joda.time.DateTimeUtils;
+
+import java.util.Map;
+
+/**
+ * {@link org.apache.flume.Event} implementation that has a timestamp.
+ * The timestamp is taken from (in order of precedence):<ol>
+ * <li>The "timestamp" header of the base event, if present</li>
+ * <li>The "@timestamp" header of the base event, if present</li>
+ * <li>The current time in millis, otherwise</li>
+ * </ol>
+ */
+final class TimestampedEvent extends SimpleEvent {
+
+  private final long timestamp;
+
+  TimestampedEvent(Event base) {
+    setBody(base.getBody());
+    Map<String, String> headers = Maps.newHashMap(base.getHeaders());
+    String timestampString = headers.get("timestamp");
+    if (StringUtils.isBlank(timestampString)) {
+      timestampString = headers.get("@timestamp");
+    }
+    if (StringUtils.isBlank(timestampString)) {
+      this.timestamp = DateTimeUtils.currentTimeMillis();
+      headers.put("timestamp", String.valueOf(timestamp));
+    } else {
+      this.timestamp = Long.parseLong(timestampString);
+    }
+    setHeaders(headers);
+  }
+
+  long getTimestamp() {
+    return timestamp;
+  }
+}
diff --git a/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/client/ElasticSearchClient.java b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/client/ElasticSearchClient.java
new file mode 100644
index 0000000..655e00a
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/client/ElasticSearchClient.java
@@ -0,0 +1,58 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.flume.sink.elasticsearch.client;
+
+import org.apache.flume.Event;
+import org.apache.flume.conf.Configurable;
+import org.apache.flume.sink.elasticsearch.IndexNameBuilder;
+
+/**
+ * Interface for an ElasticSearch client which is responsible for sending bulks
+ * of events to ElasticSearch.
+ */
+public interface ElasticSearchClient extends Configurable {
+
+  /**
+   * Closes the client's connection to ElasticSearch
+   */
+  void close();
+
+  /**
+   * Add new event to the bulk
+   *
+   * @param event
+   *    Flume Event
+   * @param indexNameBuilder
+   *    Index name builder which generates name of index to feed
+   * @param indexType
+   *    Name of type of document which will be sent to the elasticsearch cluster
+   * @param ttlMs
+   *    Time to live expressed in milliseconds. Value <= 0 is ignored
+   * @throws Exception
+   */
+  public void addEvent(Event event, IndexNameBuilder indexNameBuilder,
+      String indexType, long ttlMs) throws Exception;
+
+  /**
+   * Sends bulk to the elasticsearch cluster
+   *
+   * @throws Exception
+   */
+  void execute() throws Exception;
+}
diff --git a/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/client/ElasticSearchClientFactory.java b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/client/ElasticSearchClientFactory.java
new file mode 100644
index 0000000..986fb2b
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/client/ElasticSearchClientFactory.java
@@ -0,0 +1,78 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.flume.sink.elasticsearch.client;
+
+import org.apache.flume.sink.elasticsearch.ElasticSearchEventSerializer;
+import org.apache.flume.sink.elasticsearch.ElasticSearchIndexRequestBuilderFactory;
+
+/**
+ * Internal ElasticSearch client factory. Responsible for creating instances
+ * of ElasticSearch clients.
+ */
+public class ElasticSearchClientFactory {
+  public static final String TransportClient = "transport";
+  public static final String RestClient = "rest";
+
+  /**
+   * Creates a client of the requested type for an external cluster.
+   * @param clientType String representation of client type
+   * @param hostNames
+   *    Array of strings that represents hostnames with ports (hostname:port)
+   * @param clusterName
+   *    Elasticsearch cluster name used only by Transport Client
+   * @param serializer
+   *    Serializer of flume events to elasticsearch documents
+   * @param indexBuilder Index builder factory, used when serializer is null
+   * @return Client of the requested type
+   */
+  public ElasticSearchClient getClient(String clientType, String[] hostNames,
+      String clusterName, ElasticSearchEventSerializer serializer,
+      ElasticSearchIndexRequestBuilderFactory indexBuilder) throws NoSuchClientTypeException {
+    if (clientType.equalsIgnoreCase(TransportClient) && serializer != null) {
+      return new ElasticSearchTransportClient(hostNames, clusterName, serializer);
+    } else if (clientType.equalsIgnoreCase(TransportClient) && indexBuilder != null) { 
+      return new ElasticSearchTransportClient(hostNames, clusterName, indexBuilder);
+    } else if (clientType.equalsIgnoreCase(RestClient) && serializer != null) {
+      return new ElasticSearchRestClient(hostNames, serializer);
+    }
+    throw new NoSuchClientTypeException();
+  }
+
+  /**
+   * Used for tests only. Creates local elasticsearch instance client.
+   *
+   * @param clientType Name of client to use
+   * @param serializer Serializer for the event
+   * @param indexBuilder Index builder factory
+   *
+   * @return Local elastic search instance client
+   */
+  public ElasticSearchClient getLocalClient(String clientType,
+                                            ElasticSearchEventSerializer serializer,
+                                            ElasticSearchIndexRequestBuilderFactory indexBuilder)
+      throws NoSuchClientTypeException {
+    if (clientType.equalsIgnoreCase(TransportClient) && serializer != null) {
+      return new ElasticSearchTransportClient(serializer);
+    } else if (clientType.equalsIgnoreCase(TransportClient) && indexBuilder != null) {
+      return new ElasticSearchTransportClient(indexBuilder);
+    }
+    // No local REST client exists, so RestClient falls through to the exception below.
+    throw new NoSuchClientTypeException();
+  }
+}
diff --git a/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/client/ElasticSearchRestClient.java b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/client/ElasticSearchRestClient.java
new file mode 100644
index 0000000..e51efe2
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/client/ElasticSearchRestClient.java
@@ -0,0 +1,149 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.flume.sink.elasticsearch.client;
+
+import com.google.common.annotations.VisibleForTesting;
+import com.google.gson.Gson;
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.flume.EventDeliveryException;
+import org.apache.flume.sink.elasticsearch.ElasticSearchEventSerializer;
+import org.apache.flume.sink.elasticsearch.IndexNameBuilder;
+import org.apache.http.HttpResponse;
+import org.apache.http.HttpStatus;
+import org.apache.http.client.HttpClient;
+import org.apache.http.client.methods.HttpPost;
+import org.apache.http.entity.StringEntity;
+import org.apache.http.impl.client.DefaultHttpClient;
+import org.apache.http.util.EntityUtils;
+import org.elasticsearch.common.bytes.BytesReference;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.Arrays;
+import java.util.HashMap;
+import java.util.Map;
+
+/**
+ * Rest ElasticSearch client which is responsible for sending bulks of events to
+ * ElasticSearch using the ElasticSearch HTTP API. The client is configurable, so
+ * any config params it requires should be read in {@link #configure(Context)}.
+ */
+public class ElasticSearchRestClient implements ElasticSearchClient {
+
+  private static final String INDEX_OPERATION_NAME = "index";
+  private static final String INDEX_PARAM = "_index";
+  private static final String TYPE_PARAM = "_type";
+  private static final String TTL_PARAM = "_ttl";
+  private static final String BULK_ENDPOINT = "_bulk";
+
+  private static final Logger logger = LoggerFactory.getLogger(ElasticSearchRestClient.class);
+
+  private final ElasticSearchEventSerializer serializer;
+  private final RoundRobinList<String> serversList;
+  
+  private StringBuilder bulkBuilder;
+  private HttpClient httpClient;
+  
+  public ElasticSearchRestClient(String[] hostNames,
+      ElasticSearchEventSerializer serializer) {
+
+    for (int i = 0; i < hostNames.length; ++i) {
+      if (!hostNames[i].startsWith("http://") && !hostNames[i].startsWith("https://")) {
+        hostNames[i] = "http://" + hostNames[i];
+      }
+    }
+    this.serializer = serializer;
+
+    serversList = new RoundRobinList<String>(Arrays.asList(hostNames));
+    httpClient = new DefaultHttpClient();
+    bulkBuilder = new StringBuilder();
+  }
+
+  @VisibleForTesting
+  public ElasticSearchRestClient(String[] hostNames,
+          ElasticSearchEventSerializer serializer, HttpClient client) {
+    this(hostNames, serializer);
+    httpClient = client;
+  }
+
+  @Override
+  public void configure(Context context) {
+  }
+
+  @Override
+  public void close() {
+  }
+
+  @Override
+  public void addEvent(Event event, IndexNameBuilder indexNameBuilder, String indexType,
+                       long ttlMs) throws Exception {
+    BytesReference content = serializer.getContentBuilder(event).bytes();
+    Map<String, Map<String, String>> parameters = new HashMap<String, Map<String, String>>();
+    Map<String, String> indexParameters = new HashMap<String, String>();
+    indexParameters.put(INDEX_PARAM, indexNameBuilder.getIndexName(event));
+    indexParameters.put(TYPE_PARAM, indexType);
+    if (ttlMs > 0) {
+      indexParameters.put(TTL_PARAM, Long.toString(ttlMs));
+    }
+    parameters.put(INDEX_OPERATION_NAME, indexParameters);
+
+    Gson gson = new Gson();
+    synchronized (bulkBuilder) {
+      bulkBuilder.append(gson.toJson(parameters));
+      bulkBuilder.append("\n");
+      bulkBuilder.append(content.toBytesArray().toUtf8());
+      bulkBuilder.append("\n");
+    }
+  }
+
+  @Override
+  public void execute() throws Exception {
+    int statusCode = 0, triesCount = 0;
+    String responseBody = null;
+    String entity;
+    synchronized (bulkBuilder) {
+      entity = bulkBuilder.toString();
+      bulkBuilder = new StringBuilder();
+    }
+
+    while (statusCode != HttpStatus.SC_OK && triesCount < serversList.size()) {
+      triesCount++;
+      String host = serversList.get();
+      String url = host + "/" + BULK_ENDPOINT;
+      HttpPost httpRequest = new HttpPost(url);
+      httpRequest.setEntity(new StringEntity(entity, "UTF-8"));
+      HttpResponse response = httpClient.execute(httpRequest);
+      statusCode = response.getStatusLine().getStatusCode();
+      logger.info("Status code from elasticsearch: " + statusCode);
+      // Read the body once: the response stream cannot be consumed a second time.
+      responseBody = response.getEntity() == null ? null
+          : EntityUtils.toString(response.getEntity(), "UTF-8");
+      logger.debug("Status message from elasticsearch: " + responseBody);
+    }
+
+    if (statusCode != HttpStatus.SC_OK) {
+      if (responseBody != null) {
+        throw new EventDeliveryException(responseBody);
+      } else {
+        throw new EventDeliveryException("Elasticsearch status code was: " + statusCode);
+      }
+    }
+  }
+}
diff --git a/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/client/ElasticSearchTransportClient.java b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/client/ElasticSearchTransportClient.java
new file mode 100644
index 0000000..2cf365e
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/client/ElasticSearchTransportClient.java
@@ -0,0 +1,228 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.flume.sink.elasticsearch.client;
+
+import com.google.common.annotations.VisibleForTesting;
+import java.io.IOException;
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.flume.EventDeliveryException;
+import org.apache.flume.sink.elasticsearch.ElasticSearchEventSerializer;
+import org.apache.flume.sink.elasticsearch.IndexNameBuilder;
+import org.elasticsearch.action.bulk.BulkRequestBuilder;
+import org.elasticsearch.action.bulk.BulkResponse;
+import org.elasticsearch.action.index.IndexRequestBuilder;
+import org.elasticsearch.client.Client;
+import org.elasticsearch.client.transport.TransportClient;
+import org.elasticsearch.common.settings.ImmutableSettings;
+import org.elasticsearch.common.settings.Settings;
+import org.elasticsearch.common.transport.InetSocketTransportAddress;
+import org.elasticsearch.node.Node;
+import org.elasticsearch.node.NodeBuilder;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.Arrays;
+import org.apache.flume.sink.elasticsearch.ElasticSearchIndexRequestBuilderFactory;
+
+import static org.apache.flume.sink.elasticsearch.ElasticSearchSinkConstants.DEFAULT_PORT;
+
+public class ElasticSearchTransportClient implements ElasticSearchClient {
+
+  public static final Logger logger = LoggerFactory
+      .getLogger(ElasticSearchTransportClient.class);
+
+  private InetSocketTransportAddress[] serverAddresses;
+  private ElasticSearchEventSerializer serializer;
+  private ElasticSearchIndexRequestBuilderFactory indexRequestBuilderFactory;
+  private BulkRequestBuilder bulkRequestBuilder;
+
+  private Client client;
+
+  @VisibleForTesting
+  InetSocketTransportAddress[] getServerAddresses() {
+    return serverAddresses;
+  }
+
+  @VisibleForTesting
+  void setBulkRequestBuilder(BulkRequestBuilder bulkRequestBuilder) {
+    this.bulkRequestBuilder = bulkRequestBuilder;
+  }
+
+  /**
+   * Transport client for external cluster
+   * 
+   * @param hostNames
+   * @param clusterName
+   * @param serializer
+   */
+  public ElasticSearchTransportClient(String[] hostNames, String clusterName,
+      ElasticSearchEventSerializer serializer) {
+    configureHostnames(hostNames);
+    this.serializer = serializer;
+    openClient(clusterName);
+  }
+
+  public ElasticSearchTransportClient(String[] hostNames, String clusterName,
+      ElasticSearchIndexRequestBuilderFactory indexBuilder) {
+    configureHostnames(hostNames);
+    this.indexRequestBuilderFactory = indexBuilder;
+    openClient(clusterName);
+  }
+  
+  /**
+   * Local transport client only for testing
+   * 
+   * @param indexBuilderFactory
+   */
+  public ElasticSearchTransportClient(ElasticSearchIndexRequestBuilderFactory indexBuilderFactory) {
+    this.indexRequestBuilderFactory = indexBuilderFactory;
+    openLocalDiscoveryClient();
+  }
+  
+  /**
+   * Local transport client only for testing
+   *
+   * @param serializer
+   */
+  public ElasticSearchTransportClient(ElasticSearchEventSerializer serializer) {
+    this.serializer = serializer;
+    openLocalDiscoveryClient();
+  }
+
+  /**
+   * Used for testing
+   *
+   * @param client
+   *    ElasticSearch Client
+   * @param serializer
+   *    Event Serializer
+   */
+  public ElasticSearchTransportClient(Client client,
+      ElasticSearchEventSerializer serializer) {
+    this.client = client;
+    this.serializer = serializer;
+  }
+
+  /**
+   * Used for testing
+   */
+  public ElasticSearchTransportClient(Client client,
+                                      ElasticSearchIndexRequestBuilderFactory requestBuilderFactory)
+      throws IOException {
+    this.client = client;
+    requestBuilderFactory.createIndexRequest(client, null, null, null);
+  }
+
+  private void configureHostnames(String[] hostNames) {
+    logger.debug("Configuring ElasticSearch hostnames: {}", Arrays.toString(hostNames));
+    serverAddresses = new InetSocketTransportAddress[hostNames.length];
+    for (int i = 0; i < hostNames.length; i++) {
+      String[] hostPort = hostNames[i].trim().split(":");
+      String host = hostPort[0].trim();
+      int port = hostPort.length == 2 ? Integer.parseInt(hostPort[1].trim())
+              : DEFAULT_PORT;
+      serverAddresses[i] = new InetSocketTransportAddress(host, port);
+    }
+  }
+  
+  @Override
+  public void close() {
+    if (client != null) {
+      client.close();
+    }
+    client = null;
+  }
+
+  @Override
+  public void addEvent(Event event, IndexNameBuilder indexNameBuilder,
+      String indexType, long ttlMs) throws Exception {
+    if (bulkRequestBuilder == null) {
+      bulkRequestBuilder = client.prepareBulk();
+    }
+
+    IndexRequestBuilder indexRequestBuilder = null;
+    if (indexRequestBuilderFactory == null) {
+      indexRequestBuilder = client
+          .prepareIndex(indexNameBuilder.getIndexName(event), indexType)
+          .setSource(serializer.getContentBuilder(event).bytes());
+    } else {
+      indexRequestBuilder = indexRequestBuilderFactory.createIndexRequest(
+          client, indexNameBuilder.getIndexPrefix(event), indexType, event);
+    }
+
+    if (ttlMs > 0) {
+      indexRequestBuilder.setTTL(ttlMs);
+    }
+    bulkRequestBuilder.add(indexRequestBuilder);
+  }
+
+  @Override
+  public void execute() throws Exception {
+    try {
+      BulkResponse bulkResponse = bulkRequestBuilder.execute().actionGet();
+      if (bulkResponse.hasFailures()) {
+        throw new EventDeliveryException(bulkResponse.buildFailureMessage());
+      }
+    } finally {
+      bulkRequestBuilder = client.prepareBulk();
+    }
+  }
+
+  /**
+   * Open client to elasticsearch cluster
+   * 
+   * @param clusterName
+   */
+  private void openClient(String clusterName) {
+    logger.info("Using ElasticSearch hostnames: {} ",
+        Arrays.toString(serverAddresses));
+    Settings settings = ImmutableSettings.settingsBuilder()
+        .put("cluster.name", clusterName).build();
+
+    TransportClient transportClient = new TransportClient(settings);
+    for (InetSocketTransportAddress host : serverAddresses) {
+      transportClient.addTransportAddress(host);
+    }
+    if (client != null) {
+      client.close();
+    }
+    client = transportClient;
+  }
+
+  /*
+   * FOR TESTING ONLY...
+   * 
+   * Opens a local discovery node for talking to an elasticsearch server running
+   * in the same JVM
+   */
+  private void openLocalDiscoveryClient() {
+    logger.info("Using ElasticSearch AutoDiscovery mode");
+    Node node = NodeBuilder.nodeBuilder().client(true).local(true).node();
+    if (client != null) {
+      client.close();
+    }
+    client = node.client();
+  }
+
+  @Override
+  public void configure(Context context) {
+    // No configuration is required for the transport client.
+  }
+}
diff --git a/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/client/NoSuchClientTypeException.java b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/client/NoSuchClientTypeException.java
new file mode 100644
index 0000000..41fbe0d
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/client/NoSuchClientTypeException.java
@@ -0,0 +1,23 @@
+/*
+ * Copyright 2014 Apache Software Foundation.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flume.sink.elasticsearch.client;
+
+/**
+ * Thrown when the requested ElasticSearch client type is not supported.
+ */
+class NoSuchClientTypeException extends Exception {
+}
diff --git a/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/client/RoundRobinList.java b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/client/RoundRobinList.java
new file mode 100644
index 0000000..4cbbe91
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/client/RoundRobinList.java
@@ -0,0 +1,44 @@
+/*
+ * Copyright 2014 Apache Software Foundation.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flume.sink.elasticsearch.client;
+
+import java.util.Collection;
+import java.util.Iterator;
+
+public class RoundRobinList<T> {
+
+  private Iterator<T> iterator;
+  private final Collection<T> elements;
+
+  public RoundRobinList(Collection<T> elements) {
+    this.elements = elements;
+    iterator = this.elements.iterator();
+  }
+
+  public synchronized T get() {
+    if (iterator.hasNext()) {
+      return iterator.next();
+    } else {
+      iterator = elements.iterator();
+      return iterator.next();
+    }
+  }
+
+  public int size() {
+    return elements.size();
+  }
+}
diff --git a/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/java/org/apache/flume/sink/elasticsearch/AbstractElasticSearchSinkTest.java b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/java/org/apache/flume/sink/elasticsearch/AbstractElasticSearchSinkTest.java
new file mode 100644
index 0000000..9fbd747
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/java/org/apache/flume/sink/elasticsearch/AbstractElasticSearchSinkTest.java
@@ -0,0 +1,164 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.flume.sink.elasticsearch;
+
+import org.apache.flume.Channel;
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.flume.channel.MemoryChannel;
+import org.apache.flume.conf.Configurables;
+import org.elasticsearch.action.search.SearchResponse;
+import org.elasticsearch.client.Client;
+import org.elasticsearch.common.collect.Maps;
+import org.elasticsearch.common.settings.ImmutableSettings;
+import org.elasticsearch.common.settings.Settings;
+import org.elasticsearch.gateway.Gateway;
+import org.elasticsearch.index.query.QueryBuilder;
+import org.elasticsearch.index.query.QueryBuilders;
+import org.elasticsearch.node.Node;
+import org.elasticsearch.node.NodeBuilder;
+import org.elasticsearch.node.internal.InternalNode;
+import org.elasticsearch.search.SearchHit;
+import org.elasticsearch.search.SearchHits;
+import org.joda.time.DateTimeUtils;
+import org.junit.After;
+import org.junit.Before;
+
+import java.util.Arrays;
+import java.util.Comparator;
+import java.util.Map;
+
+import static org.apache.flume.sink.elasticsearch.ElasticSearchSinkConstants.BATCH_SIZE;
+import static org.apache.flume.sink.elasticsearch.ElasticSearchSinkConstants.CLUSTER_NAME;
+import static org.apache.flume.sink.elasticsearch.ElasticSearchSinkConstants.INDEX_NAME;
+import static org.apache.flume.sink.elasticsearch.ElasticSearchSinkConstants.INDEX_TYPE;
+import static org.apache.flume.sink.elasticsearch.ElasticSearchSinkConstants.TTL;
+import static org.junit.Assert.assertEquals;
+
+public abstract class AbstractElasticSearchSinkTest {
+
+  static final String DEFAULT_INDEX_NAME = "flume";
+  static final String DEFAULT_INDEX_TYPE = "log";
+  static final String DEFAULT_CLUSTER_NAME = "elasticsearch";
+  static final long FIXED_TIME_MILLIS = 123456789L;
+
+  Node node;
+  Client client;
+  String timestampedIndexName;
+  Map<String, String> parameters;
+
+  void initDefaults() {
+    parameters = Maps.newHashMap();
+    parameters.put(INDEX_NAME, DEFAULT_INDEX_NAME);
+    parameters.put(INDEX_TYPE, DEFAULT_INDEX_TYPE);
+    parameters.put(CLUSTER_NAME, DEFAULT_CLUSTER_NAME);
+    parameters.put(BATCH_SIZE, "1");
+    parameters.put(TTL, "5");
+
+    timestampedIndexName = DEFAULT_INDEX_NAME + '-'
+        + ElasticSearchIndexRequestBuilderFactory.df.format(FIXED_TIME_MILLIS);
+  }
+
+  void createNodes() throws Exception {
+    Settings settings = ImmutableSettings
+        .settingsBuilder()
+        .put("number_of_shards", 1)
+        .put("number_of_replicas", 0)
+        .put("routing.hash.type", "simple")
+        .put("gateway.type", "none")
+        .put("path.data", "target/es-test")
+        .build();
+
+    node = NodeBuilder.nodeBuilder().settings(settings).local(true).node();
+    client = node.client();
+
+    client.admin().cluster().prepareHealth().setWaitForGreenStatus().execute()
+        .actionGet();
+  }
+
+  void shutdownNodes() throws Exception {
+    ((InternalNode) node).injector().getInstance(Gateway.class).reset();
+    client.close();
+    node.close();
+  }
+
+  @Before
+  public void setFixedJodaTime() {
+    DateTimeUtils.setCurrentMillisFixed(FIXED_TIME_MILLIS);
+  }
+
+  @After
+  public void resetJodaTime() {
+    DateTimeUtils.setCurrentMillisSystem();
+  }
+
+  Channel bindAndStartChannel(ElasticSearchSink fixture) {
+    // Configure the channel
+    Channel channel = new MemoryChannel();
+    Configurables.configure(channel, new Context());
+
+    // Wire them together
+    fixture.setChannel(channel);
+    fixture.start();
+    return channel;
+  }
+
+  void assertMatchAllQuery(int expectedHits, Event... events) {
+    assertSearch(expectedHits, performSearch(QueryBuilders.matchAllQuery()),
+        null, events);
+  }
+
+  void assertBodyQuery(int expectedHits, Event... events) {
+    // Match documents whose @message field contains the term "event"
+    assertSearch(expectedHits,
+        performSearch(QueryBuilders.fieldQuery("@message", "event")),
+        null, events);
+  }
+
+  SearchResponse performSearch(QueryBuilder query) {
+    return client.prepareSearch(timestampedIndexName)
+        .setTypes(DEFAULT_INDEX_TYPE).setQuery(query).execute().actionGet();
+  }
+
+  void assertSearch(int expectedHits, SearchResponse response, Map<String, Object> expectedBody,
+                    Event... events) {
+    SearchHits hitResponse = response.getHits();
+    assertEquals(expectedHits, hitResponse.getTotalHits());
+
+    SearchHit[] hits = hitResponse.getHits();
+    Arrays.sort(hits, new Comparator<SearchHit>() {
+      @Override
+      public int compare(SearchHit o1, SearchHit o2) {
+        return o1.getSourceAsString().compareTo(o2.getSourceAsString());
+      }
+    });
+
+    for (int i = 0; i < events.length; i++) {
+      Event event = events[i];
+      SearchHit hit = hits[i];
+      Map<String, Object> source = hit.getSource();
+      if (expectedBody == null) {
+        assertEquals(new String(event.getBody()), source.get("@message"));
+      } else {
+        assertEquals(expectedBody, source.get("@message"));
+      }
+    }
+  }
+
+}
diff --git a/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/java/org/apache/flume/sink/elasticsearch/TestElasticSearchDynamicSerializer.java b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/java/org/apache/flume/sink/elasticsearch/TestElasticSearchDynamicSerializer.java
new file mode 100644
index 0000000..d4e4654
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/java/org/apache/flume/sink/elasticsearch/TestElasticSearchDynamicSerializer.java
@@ -0,0 +1,64 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.flume.sink.elasticsearch;
+
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.flume.event.EventBuilder;
+import org.elasticsearch.common.collect.Maps;
+import org.elasticsearch.common.xcontent.XContentBuilder;
+import org.junit.Test;
+
+import java.util.Map;
+
+import static org.apache.flume.sink.elasticsearch.ElasticSearchEventSerializer.charset;
+import static org.elasticsearch.common.xcontent.XContentFactory.jsonBuilder;
+import static org.junit.Assert.assertEquals;
+
+public class TestElasticSearchDynamicSerializer {
+
+  @Test
+  public void testRoundTrip() throws Exception {
+    ElasticSearchDynamicSerializer fixture = new ElasticSearchDynamicSerializer();
+    Context context = new Context();
+    fixture.configure(context);
+
+    String message = "test body";
+    Map<String, String> headers = Maps.newHashMap();
+    headers.put("headerNameOne", "headerValueOne");
+    headers.put("headerNameTwo", "headerValueTwo");
+    headers.put("headerNameThree", "headerValueThree");
+    Event event = EventBuilder.withBody(message.getBytes(charset));
+    event.setHeaders(headers);
+
+    XContentBuilder expected = jsonBuilder().startObject();
+    expected.field("body", new String(message.getBytes(charset), charset));
+    for (String headerName : headers.keySet()) {
+      expected.field(headerName, new String(
+          headers.get(headerName).getBytes(charset), charset));
+    }
+    expected.endObject();
+
+    XContentBuilder actual = fixture.getContentBuilder(event);
+
+    assertEquals(new String(expected.bytes().array()), new String(actual
+        .bytes().array()));
+
+  }
+}
diff --git a/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/java/org/apache/flume/sink/elasticsearch/TestElasticSearchIndexRequestBuilderFactory.java b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/java/org/apache/flume/sink/elasticsearch/TestElasticSearchIndexRequestBuilderFactory.java
new file mode 100644
index 0000000..b62254e
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/java/org/apache/flume/sink/elasticsearch/TestElasticSearchIndexRequestBuilderFactory.java
@@ -0,0 +1,215 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.flume.sink.elasticsearch;
+
+import com.google.common.collect.Maps;
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.flume.conf.ComponentConfiguration;
+import org.apache.flume.conf.sink.SinkConfiguration;
+import org.apache.flume.event.SimpleEvent;
+import org.elasticsearch.action.index.IndexRequestBuilder;
+import org.elasticsearch.client.Client;
+import org.elasticsearch.common.io.BytesStream;
+import org.elasticsearch.common.io.FastByteArrayOutputStream;
+import org.junit.Before;
+import org.junit.Test;
+
+import java.io.IOException;
+import java.util.Map;
+
+import static org.junit.Assert.assertArrayEquals;
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertFalse;
+import static org.junit.Assert.assertNull;
+import static org.junit.Assert.assertTrue;
+
+public class TestElasticSearchIndexRequestBuilderFactory
+    extends AbstractElasticSearchSinkTest {
+
+  private static final Client FAKE_CLIENT = null;
+
+  private EventSerializerIndexRequestBuilderFactory factory;
+
+  private FakeEventSerializer serializer;
+
+  @Before
+  public void setupFactory() throws Exception {
+    serializer = new FakeEventSerializer();
+    factory = new EventSerializerIndexRequestBuilderFactory(serializer) {
+      @Override
+      IndexRequestBuilder prepareIndex(Client client) {
+        return new IndexRequestBuilder(FAKE_CLIENT);
+      }
+    };
+  }
+
+  @Test
+  public void shouldUseUtcAsBasisForDateFormat() {
+    assertEquals("Coordinated Universal Time",
+        factory.fastDateFormat.getTimeZone().getDisplayName());
+  }
+
+  @Test
+  public void indexNameShouldBePrefixDashFormattedTimestamp() {
+    long millis = 987654321L;
+    assertEquals("prefix-" + factory.fastDateFormat.format(millis),
+        factory.getIndexName("prefix", millis));
+  }
+
+  @Test
+  public void shouldEnsureTimestampHeaderPresentInTimestampedEvent() {
+    SimpleEvent base = new SimpleEvent();
+
+    TimestampedEvent timestampedEvent = new TimestampedEvent(base);
+    assertEquals(FIXED_TIME_MILLIS, timestampedEvent.getTimestamp());
+    assertEquals(String.valueOf(FIXED_TIME_MILLIS),
+        timestampedEvent.getHeaders().get("timestamp"));
+  }
+
+  @Test
+  public void shouldUseExistingTimestampHeaderInTimestampedEvent() {
+    SimpleEvent base = new SimpleEvent();
+    Map<String, String> headersWithTimestamp = Maps.newHashMap();
+    headersWithTimestamp.put("timestamp", "-321");
+    base.setHeaders(headersWithTimestamp);
+
+    TimestampedEvent timestampedEvent = new TimestampedEvent(base);
+    assertEquals(-321L, timestampedEvent.getTimestamp());
+    assertEquals("-321", timestampedEvent.getHeaders().get("timestamp"));
+  }
+
+  @Test
+  public void shouldUseExistingAtTimestampHeaderInTimestampedEvent() {
+    SimpleEvent base = new SimpleEvent();
+    Map<String, String> headersWithTimestamp = Maps.newHashMap();
+    headersWithTimestamp.put("@timestamp", "-999");
+    base.setHeaders(headersWithTimestamp);
+
+    TimestampedEvent timestampedEvent = new TimestampedEvent(base);
+    assertEquals(-999L, timestampedEvent.getTimestamp());
+    assertEquals("-999", timestampedEvent.getHeaders().get("@timestamp"));
+    assertNull(timestampedEvent.getHeaders().get("timestamp"));
+  }
+
+  @Test
+  public void shouldPreserveBodyAndNonTimestampHeadersInTimestampedEvent() {
+    SimpleEvent base = new SimpleEvent();
+    base.setBody(new byte[] {1, 2, 3, 4});
+    Map<String, String> headers = Maps.newHashMap();
+    headers.put("foo", "bar");
+    base.setHeaders(headers);
+
+    TimestampedEvent timestampedEvent = new TimestampedEvent(base);
+    assertEquals("bar", timestampedEvent.getHeaders().get("foo"));
+    assertArrayEquals(base.getBody(), timestampedEvent.getBody());
+  }
+
+  @Test
+  public void shouldSetIndexNameTypeAndSerializedEventIntoIndexRequest()
+      throws Exception {
+
+    String indexPrefix = "qwerty";
+    String indexType = "uiop";
+    Event event = new SimpleEvent();
+
+    IndexRequestBuilder indexRequestBuilder = factory.createIndexRequest(
+        FAKE_CLIENT, indexPrefix, indexType, event);
+
+    assertEquals(indexPrefix + '-'
+        + ElasticSearchIndexRequestBuilderFactory.df.format(FIXED_TIME_MILLIS),
+        indexRequestBuilder.request().index());
+    assertEquals(indexType, indexRequestBuilder.request().type());
+    assertArrayEquals(FakeEventSerializer.FAKE_BYTES,
+        indexRequestBuilder.request().source().array());
+  }
+
+  @Test
+  public void shouldSetIndexNameFromTimestampHeaderWhenPresent()
+      throws Exception {
+    String indexPrefix = "qwerty";
+    String indexType = "uiop";
+    Event event = new SimpleEvent();
+    event.getHeaders().put("timestamp", "1213141516");
+
+    IndexRequestBuilder indexRequestBuilder = factory.createIndexRequest(
+        null, indexPrefix, indexType, event);
+
+    assertEquals(indexPrefix + '-'
+        + ElasticSearchIndexRequestBuilderFactory.df.format(1213141516L),
+        indexRequestBuilder.request().index());
+  }
+
+  @Test
+  public void shouldSetIndexNameTypeFromHeaderWhenPresent()
+      throws Exception {
+    String indexPrefix = "%{index-name}";
+    String indexType = "%{index-type}";
+    String indexValue = "testing-index-name-from-headers";
+    String typeValue = "testing-index-type-from-headers";
+
+    Event event = new SimpleEvent();
+    event.getHeaders().put("index-name", indexValue);
+    event.getHeaders().put("index-type", typeValue);
+
+    IndexRequestBuilder indexRequestBuilder = factory.createIndexRequest(
+        null, indexPrefix, indexType, event);
+
+    assertEquals(indexValue + '-'
+        + ElasticSearchIndexRequestBuilderFactory.df.format(FIXED_TIME_MILLIS),
+        indexRequestBuilder.request().index());
+    assertEquals(typeValue, indexRequestBuilder.request().type());
+  }
+
+  @Test
+  public void shouldConfigureEventSerializer() throws Exception {
+    assertFalse(serializer.configuredWithContext);
+    factory.configure(new Context());
+    assertTrue(serializer.configuredWithContext);
+
+    assertFalse(serializer.configuredWithComponentConfiguration);
+    factory.configure(new SinkConfiguration("name"));
+    assertTrue(serializer.configuredWithComponentConfiguration);
+  }
+
+  static class FakeEventSerializer implements ElasticSearchEventSerializer {
+
+    static final byte[] FAKE_BYTES = new byte[]{9, 8, 7, 6};
+    boolean configuredWithContext;
+    boolean configuredWithComponentConfiguration;
+
+    @Override
+    public BytesStream getContentBuilder(Event event) throws IOException {
+      FastByteArrayOutputStream fbaos = new FastByteArrayOutputStream(4);
+      fbaos.write(FAKE_BYTES);
+      return fbaos;
+    }
+
+    @Override
+    public void configure(Context arg0) {
+      configuredWithContext = true;
+    }
+
+    @Override
+    public void configure(ComponentConfiguration arg0) {
+      configuredWithComponentConfiguration = true;
+    }
+  }
+
+}
diff --git a/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/java/org/apache/flume/sink/elasticsearch/TestElasticSearchLogStashEventSerializer.java b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/java/org/apache/flume/sink/elasticsearch/TestElasticSearchLogStashEventSerializer.java
new file mode 100644
index 0000000..65b4dab
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/java/org/apache/flume/sink/elasticsearch/TestElasticSearchLogStashEventSerializer.java
@@ -0,0 +1,127 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.flume.sink.elasticsearch;
+
+import com.google.gson.JsonParser;
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.flume.event.EventBuilder;
+import org.elasticsearch.common.collect.Maps;
+import org.elasticsearch.common.xcontent.XContentBuilder;
+import org.junit.Test;
+
+import java.util.Date;
+import java.util.Map;
+
+import static org.apache.flume.sink.elasticsearch.ElasticSearchEventSerializer.charset;
+import static org.elasticsearch.common.xcontent.XContentFactory.jsonBuilder;
+import static org.junit.Assert.assertEquals;
+
+public class TestElasticSearchLogStashEventSerializer {
+
+  @Test
+  public void testRoundTrip() throws Exception {
+    ElasticSearchLogStashEventSerializer fixture = new ElasticSearchLogStashEventSerializer();
+    Context context = new Context();
+    fixture.configure(context);
+
+    String message = "test body";
+    Map<String, String> headers = Maps.newHashMap();
+    long timestamp = System.currentTimeMillis();
+    headers.put("timestamp", String.valueOf(timestamp));
+    headers.put("source", "flume_tail_src");
+    headers.put("host", "test@localhost");
+    headers.put("src_path", "/tmp/test");
+    headers.put("headerNameOne", "headerValueOne");
+    headers.put("headerNameTwo", "headerValueTwo");
+    headers.put("type", "sometype");
+    Event event = EventBuilder.withBody(message.getBytes(charset));
+    event.setHeaders(headers);
+
+    XContentBuilder expected = jsonBuilder().startObject();
+    expected.field("@message", new String(message.getBytes(charset), charset));
+    expected.field("@timestamp", new Date(timestamp));
+    expected.field("@source", "flume_tail_src");
+    expected.field("@type", "sometype");
+    expected.field("@source_host", "test@localhost");
+    expected.field("@source_path", "/tmp/test");
+
+    expected.startObject("@fields");
+    expected.field("timestamp", String.valueOf(timestamp));
+    expected.field("src_path", "/tmp/test");
+    expected.field("host", "test@localhost");
+    expected.field("headerNameTwo", "headerValueTwo");
+    expected.field("source", "flume_tail_src");
+    expected.field("headerNameOne", "headerValueOne");
+    expected.field("type", "sometype");
+    expected.endObject();
+
+    expected.endObject();
+
+    XContentBuilder actual = fixture.getContentBuilder(event);
+
+    JsonParser parser = new JsonParser();
+    assertEquals(parser.parse(expected.string()), parser.parse(actual.string()));
+  }
+
+  @Test
+  public void shouldHandleInvalidJSONDuringComplexParsing() throws Exception {
+    ElasticSearchLogStashEventSerializer fixture = new ElasticSearchLogStashEventSerializer();
+    Context context = new Context();
+    fixture.configure(context);
+
+    String message = "{flume: somethingnotvalid}";
+    Map<String, String> headers = Maps.newHashMap();
+    long timestamp = System.currentTimeMillis();
+    headers.put("timestamp", String.valueOf(timestamp));
+    headers.put("source", "flume_tail_src");
+    headers.put("host", "test@localhost");
+    headers.put("src_path", "/tmp/test");
+    headers.put("headerNameOne", "headerValueOne");
+    headers.put("headerNameTwo", "headerValueTwo");
+    headers.put("type", "sometype");
+    Event event = EventBuilder.withBody(message.getBytes(charset));
+    event.setHeaders(headers);
+
+    XContentBuilder expected = jsonBuilder().startObject();
+    expected.field("@message", new String(message.getBytes(charset), charset));
+    expected.field("@timestamp", new Date(timestamp));
+    expected.field("@source", "flume_tail_src");
+    expected.field("@type", "sometype");
+    expected.field("@source_host", "test@localhost");
+    expected.field("@source_path", "/tmp/test");
+
+    expected.startObject("@fields");
+    expected.field("timestamp", String.valueOf(timestamp));
+    expected.field("src_path", "/tmp/test");
+    expected.field("host", "test@localhost");
+    expected.field("headerNameTwo", "headerValueTwo");
+    expected.field("source", "flume_tail_src");
+    expected.field("headerNameOne", "headerValueOne");
+    expected.field("type", "sometype");
+    expected.endObject();
+
+    expected.endObject();
+
+    XContentBuilder actual = fixture.getContentBuilder(event);
+
+    JsonParser parser = new JsonParser();
+    assertEquals(parser.parse(expected.string()), parser.parse(actual.string()));
+  }
+}
diff --git a/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/java/org/apache/flume/sink/elasticsearch/TestElasticSearchSink.java b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/java/org/apache/flume/sink/elasticsearch/TestElasticSearchSink.java
new file mode 100644
index 0000000..69acc06
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/java/org/apache/flume/sink/elasticsearch/TestElasticSearchSink.java
@@ -0,0 +1,505 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.flume.sink.elasticsearch;
+
+import org.apache.commons.lang.time.FastDateFormat;
+import org.apache.flume.Channel;
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.flume.Sink.Status;
+import org.apache.flume.Transaction;
+import org.apache.flume.conf.ComponentConfiguration;
+import org.apache.flume.conf.Configurable;
+import org.apache.flume.conf.Configurables;
+import org.apache.flume.event.EventBuilder;
+import org.elasticsearch.action.index.IndexRequestBuilder;
+import org.elasticsearch.client.Requests;
+import org.elasticsearch.common.UUID;
+import org.elasticsearch.common.io.BytesStream;
+import org.elasticsearch.common.io.FastByteArrayOutputStream;
+import org.elasticsearch.index.query.QueryBuilders;
+import org.junit.After;
+import org.junit.Before;
+import org.junit.Test;
+
+import java.io.IOException;
+import java.util.HashMap;
+import java.util.Map;
+import java.util.TimeZone;
+import java.util.concurrent.TimeUnit;
+
+import static org.apache.flume.sink.elasticsearch.ElasticSearchSinkConstants.BATCH_SIZE;
+import static org.apache.flume.sink.elasticsearch.ElasticSearchSinkConstants.CLUSTER_NAME;
+import static org.apache.flume.sink.elasticsearch.ElasticSearchSinkConstants.HOSTNAMES;
+import static org.apache.flume.sink.elasticsearch.ElasticSearchSinkConstants.INDEX_NAME;
+import static org.apache.flume.sink.elasticsearch.ElasticSearchSinkConstants.INDEX_TYPE;
+import static org.apache.flume.sink.elasticsearch.ElasticSearchSinkConstants.SERIALIZER;
+import static org.apache.flume.sink.elasticsearch.ElasticSearchSinkConstants.TTL;
+import static org.junit.Assert.assertArrayEquals;
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertNull;
+import static org.junit.Assert.assertTrue;
+
+public class TestElasticSearchSink extends AbstractElasticSearchSinkTest {
+
+  private ElasticSearchSink fixture;
+
+  @Before
+  public void init() throws Exception {
+    initDefaults();
+    createNodes();
+    fixture = new ElasticSearchSink(true);
+    fixture.setName("ElasticSearchSink-" + UUID.randomUUID().toString());
+  }
+
+  @After
+  public void tearDown() throws Exception {
+    shutdownNodes();
+  }
+
+  @Test
+  public void shouldIndexOneEvent() throws Exception {
+    Configurables.configure(fixture, new Context(parameters));
+    Channel channel = bindAndStartChannel(fixture);
+
+    Transaction tx = channel.getTransaction();
+    tx.begin();
+    Event event = EventBuilder.withBody("event #1 of 1".getBytes());
+    channel.put(event);
+    tx.commit();
+    tx.close();
+
+    fixture.process();
+    fixture.stop();
+    client.admin().indices()
+        .refresh(Requests.refreshRequest(timestampedIndexName)).actionGet();
+
+    assertMatchAllQuery(1, event);
+    assertBodyQuery(1, event);
+  }
+
+  @Test
+  public void shouldIndexInvalidComplexJsonBody() throws Exception {
+    parameters.put(BATCH_SIZE, "3");
+    Configurables.configure(fixture, new Context(parameters));
+    Channel channel = bindAndStartChannel(fixture);
+
+    Transaction tx = channel.getTransaction();
+    tx.begin();
+    Event event1 = EventBuilder.withBody("TEST1 {test}".getBytes());
+    channel.put(event1);
+    Event event2 = EventBuilder.withBody("{test: TEST2 }".getBytes());
+    channel.put(event2);
+    Event event3 = EventBuilder.withBody("{\"test\":{ TEST3 {test} }}".getBytes());
+    channel.put(event3);
+    tx.commit();
+    tx.close();
+
+    fixture.process();
+    fixture.stop();
+    client.admin().indices()
+        .refresh(Requests.refreshRequest(timestampedIndexName)).actionGet();
+
+    assertMatchAllQuery(3);
+    assertSearch(1,
+        performSearch(QueryBuilders.fieldQuery("@message", "TEST1")),
+        null, event1);
+    assertSearch(1,
+        performSearch(QueryBuilders.fieldQuery("@message", "TEST2")),
+        null, event2);
+    assertSearch(1,
+        performSearch(QueryBuilders.fieldQuery("@message", "TEST3")),
+        null, event3);
+  }
+
+  @Test
+  public void shouldIndexComplexJsonEvent() throws Exception {
+    Configurables.configure(fixture, new Context(parameters));
+    Channel channel = bindAndStartChannel(fixture);
+
+    Transaction tx = channel.getTransaction();
+    tx.begin();
+    Event event = EventBuilder.withBody(
+        "{\"event\":\"json content\",\"num\":1}".getBytes());
+    channel.put(event);
+    tx.commit();
+    tx.close();
+
+    fixture.process();
+    fixture.stop();
+    client.admin().indices()
+        .refresh(Requests.refreshRequest(timestampedIndexName)).actionGet();
+
+    Map<String, Object> expectedBody = new HashMap<String, Object>();
+    expectedBody.put("event", "json content");
+    expectedBody.put("num", 1);
+
+    assertSearch(1,
+        performSearch(QueryBuilders.matchAllQuery()), expectedBody, event);
+    assertSearch(1,
+        performSearch(QueryBuilders.fieldQuery("@message.event", "json")),
+        expectedBody, event);
+  }
+
+  @Test
+  public void shouldIndexFiveEvents() throws Exception {
+    // Make it so we only need to call process once
+    parameters.put(BATCH_SIZE, "5");
+    Configurables.configure(fixture, new Context(parameters));
+    Channel channel = bindAndStartChannel(fixture);
+
+    int numberOfEvents = 5;
+    Event[] events = new Event[numberOfEvents];
+
+    Transaction tx = channel.getTransaction();
+    tx.begin();
+    for (int i = 0; i < numberOfEvents; i++) {
+      String body = "event #" + i + " of " + numberOfEvents;
+      Event event = EventBuilder.withBody(body.getBytes());
+      events[i] = event;
+      channel.put(event);
+    }
+    tx.commit();
+    tx.close();
+
+    fixture.process();
+    fixture.stop();
+    client.admin().indices()
+        .refresh(Requests.refreshRequest(timestampedIndexName)).actionGet();
+
+    assertMatchAllQuery(numberOfEvents, events);
+    assertBodyQuery(5, events);
+  }
+
+  @Test
+  public void shouldIndexFiveEventsOverThreeBatches() throws Exception {
+    parameters.put(BATCH_SIZE, "2");
+    Configurables.configure(fixture, new Context(parameters));
+    Channel channel = bindAndStartChannel(fixture);
+
+    int numberOfEvents = 5;
+    Event[] events = new Event[numberOfEvents];
+
+    Transaction tx = channel.getTransaction();
+    tx.begin();
+    for (int i = 0; i < numberOfEvents; i++) {
+      String body = "event #" + i + " of " + numberOfEvents;
+      Event event = EventBuilder.withBody(body.getBytes());
+      events[i] = event;
+      channel.put(event);
+    }
+    tx.commit();
+    tx.close();
+
+    int count = 0;
+    Status status = Status.READY;
+    while (status != Status.BACKOFF) {
+      count++;
+      status = fixture.process();
+    }
+    fixture.stop();
+
+    assertEquals(3, count);
+
+    client.admin().indices()
+        .refresh(Requests.refreshRequest(timestampedIndexName)).actionGet();
+    assertMatchAllQuery(numberOfEvents, events);
+    assertBodyQuery(numberOfEvents, events);
+  }
+
+  @Test
+  public void shouldParseConfiguration() {
+    parameters.put(HOSTNAMES, "10.5.5.27");
+    parameters.put(CLUSTER_NAME, "testing-cluster-name");
+    parameters.put(INDEX_NAME, "testing-index-name");
+    parameters.put(INDEX_TYPE, "testing-index-type");
+    parameters.put(TTL, "10");
+
+    fixture = new ElasticSearchSink();
+    fixture.configure(new Context(parameters));
+
+    String[] expected = { "10.5.5.27" };
+
+    assertEquals("testing-cluster-name", fixture.getClusterName());
+    assertEquals("testing-index-name", fixture.getIndexName());
+    assertEquals("testing-index-type", fixture.getIndexType());
+    assertEquals(TimeUnit.DAYS.toMillis(10), fixture.getTTLMs());
+    assertArrayEquals(expected, fixture.getServerAddresses());
+  }
+
+  @Test
+  public void shouldParseConfigurationUsingDefaults() {
+    parameters.put(HOSTNAMES, "10.5.5.27");
+    parameters.remove(INDEX_NAME);
+    parameters.remove(INDEX_TYPE);
+    parameters.remove(CLUSTER_NAME);
+
+    fixture = new ElasticSearchSink();
+    fixture.configure(new Context(parameters));
+
+    String[] expected = { "10.5.5.27" };
+
+    assertEquals(DEFAULT_INDEX_NAME, fixture.getIndexName());
+    assertEquals(DEFAULT_INDEX_TYPE, fixture.getIndexType());
+    assertEquals(DEFAULT_CLUSTER_NAME, fixture.getClusterName());
+    assertArrayEquals(expected, fixture.getServerAddresses());
+  }
+
+  @Test
+  public void shouldParseMultipleHostUsingDefaultPorts() {
+    parameters.put(HOSTNAMES, "10.5.5.27,10.5.5.28,10.5.5.29");
+
+    fixture = new ElasticSearchSink();
+    fixture.configure(new Context(parameters));
+
+    String[] expected = { "10.5.5.27", "10.5.5.28", "10.5.5.29" };
+
+    assertArrayEquals(expected, fixture.getServerAddresses());
+  }
+
+  @Test
+  public void shouldParseMultipleHostWithWhitespacesUsingDefaultPorts() {
+    parameters.put(HOSTNAMES, " 10.5.5.27 , 10.5.5.28 , 10.5.5.29 ");
+
+    fixture = new ElasticSearchSink();
+    fixture.configure(new Context(parameters));
+
+    String[] expected = { "10.5.5.27", "10.5.5.28", "10.5.5.29" };
+
+    assertArrayEquals(expected, fixture.getServerAddresses());
+  }
+
+  @Test
+  public void shouldParseMultipleHostAndPorts() {
+    parameters.put(HOSTNAMES, "10.5.5.27:9300,10.5.5.28:9301,10.5.5.29:9302");
+
+    fixture = new ElasticSearchSink();
+    fixture.configure(new Context(parameters));
+
+    String[] expected = { "10.5.5.27:9300", "10.5.5.28:9301", "10.5.5.29:9302" };
+
+    assertArrayEquals(expected, fixture.getServerAddresses());
+  }
+
+  @Test
+  public void shouldParseMultipleHostAndPortsWithWhitespaces() {
+    parameters.put(HOSTNAMES,
+        " 10.5.5.27 : 9300 , 10.5.5.28 : 9301 , 10.5.5.29 : 9302 ");
+
+    fixture = new ElasticSearchSink();
+    fixture.configure(new Context(parameters));
+
+    String[] expected = { "10.5.5.27:9300", "10.5.5.28:9301", "10.5.5.29:9302" };
+
+    assertArrayEquals(expected, fixture.getServerAddresses());
+  }
+
+  @Test
+  public void shouldAllowCustomElasticSearchIndexRequestBuilderFactory()
+      throws Exception {
+    parameters.put(SERIALIZER,
+        CustomElasticSearchIndexRequestBuilderFactory.class.getName());
+
+    fixture.configure(new Context(parameters));
+
+    Channel channel = bindAndStartChannel(fixture);
+    Transaction tx = channel.getTransaction();
+    tx.begin();
+    String body = "{ foo: \"bar\" }";
+    Event event = EventBuilder.withBody(body.getBytes());
+    channel.put(event);
+    tx.commit();
+    tx.close();
+
+    fixture.process();
+    fixture.stop();
+
+    assertEquals(fixture.getIndexName() + "-05_17_36_789",
+        CustomElasticSearchIndexRequestBuilderFactory.actualIndexName);
+    assertEquals(fixture.getIndexType(),
+        CustomElasticSearchIndexRequestBuilderFactory.actualIndexType);
+    assertArrayEquals(event.getBody(),
+        CustomElasticSearchIndexRequestBuilderFactory.actualEventBody);
+    assertTrue(CustomElasticSearchIndexRequestBuilderFactory.hasContext);
+  }
+
+  @Test
+  public void shouldParseFullyQualifiedTTLs() {
+    Map<String, Long> testTTLMap = new HashMap<String, Long>();
+    testTTLMap.put("1ms", Long.valueOf(1));
+    testTTLMap.put("1s", Long.valueOf(1000));
+    testTTLMap.put("1m", Long.valueOf(60000));
+    testTTLMap.put("1h", Long.valueOf(3600000));
+    testTTLMap.put("1d", Long.valueOf(86400000));
+    testTTLMap.put("1w", Long.valueOf(604800000));
+    testTTLMap.put("1", Long.valueOf(86400000));
+
+    parameters.put(HOSTNAMES, "10.5.5.27");
+    parameters.put(CLUSTER_NAME, "testing-cluster-name");
+    parameters.put(INDEX_NAME, "testing-index-name");
+    parameters.put(INDEX_TYPE, "testing-index-type");
+
+    for (String ttl : testTTLMap.keySet()) {
+      parameters.put(TTL, ttl);
+      fixture = new ElasticSearchSink();
+      fixture.configure(new Context(parameters));
+
+      String[] expected = { "10.5.5.27" };
+      assertEquals("testing-cluster-name", fixture.getClusterName());
+      assertEquals("testing-index-name", fixture.getIndexName());
+      assertEquals("testing-index-type", fixture.getIndexType());
+      assertEquals((long) testTTLMap.get(ttl), fixture.getTTLMs());
+      assertArrayEquals(expected, fixture.getServerAddresses());
+
+    }
+  }
+
+  public static final class CustomElasticSearchIndexRequestBuilderFactory
+      extends AbstractElasticSearchIndexRequestBuilderFactory {
+
+    static String actualIndexName;
+    static String actualIndexType;
+    static byte[] actualEventBody;
+    static boolean hasContext;
+
+    public CustomElasticSearchIndexRequestBuilderFactory() {
+      super(FastDateFormat.getInstance("HH_mm_ss_SSS", TimeZone.getTimeZone("EST5EDT")));
+    }
+
+    @Override
+    protected void prepareIndexRequest(IndexRequestBuilder indexRequest, String indexName,
+                                       String indexType, Event event) throws IOException {
+      actualIndexName = indexName;
+      actualIndexType = indexType;
+      actualEventBody = event.getBody();
+      indexRequest.setIndex(indexName).setType(indexType).setSource(event.getBody());
+    }
+
+    @Override
+    public void configure(Context arg0) {
+      hasContext = true;
+    }
+
+    @Override
+    public void configure(ComponentConfiguration arg0) {
+      // no-op
+    }
+  }
+
+  @Test
+  public void shouldFailToConfigureWithInvalidSerializerClass()
+      throws Exception {
+
+    parameters.put(SERIALIZER, "java.lang.String");
+    try {
+      Configurables.configure(fixture, new Context(parameters));
+    } catch (ClassCastException e) {
+      // expected
+    }
+
+    parameters.put(SERIALIZER, FakeConfigurable.class.getName());
+    try {
+      Configurables.configure(fixture, new Context(parameters));
+    } catch (IllegalArgumentException e) {
+      // expected
+    }
+  }
+
+  @Test
+  public void shouldUseSpecifiedSerializer() throws Exception {
+    Context context = new Context();
+    context.put(SERIALIZER,
+        "org.apache.flume.sink.elasticsearch.FakeEventSerializer");
+
+    assertNull(fixture.getEventSerializer());
+    fixture.configure(context);
+    assertTrue(fixture.getEventSerializer() instanceof FakeEventSerializer);
+  }
+
+  @Test
+  public void shouldUseSpecifiedIndexNameBuilder() throws Exception {
+    Context context = new Context();
+    context.put(ElasticSearchSinkConstants.INDEX_NAME_BUILDER,
+            "org.apache.flume.sink.elasticsearch.FakeIndexNameBuilder");
+
+    assertNull(fixture.getIndexNameBuilder());
+    fixture.configure(context);
+    assertTrue(fixture.getIndexNameBuilder() instanceof FakeIndexNameBuilder);
+  }
+
+  public static class FakeConfigurable implements Configurable {
+    @Override
+    public void configure(Context arg0) {
+      // no-op
+    }
+  }
+}
+
+/**
+ * Internal class. Fake event serializer used only for tests.
+ */
+class FakeEventSerializer implements ElasticSearchEventSerializer {
+
+  static final byte[] FAKE_BYTES = new byte[] { 9, 8, 7, 6 };
+  boolean configuredWithContext;
+  boolean configuredWithComponentConfiguration;
+
+  @Override
+  public BytesStream getContentBuilder(Event event) throws IOException {
+    FastByteArrayOutputStream fbaos = new FastByteArrayOutputStream(4);
+    fbaos.write(FAKE_BYTES);
+    return fbaos;
+  }
+
+  @Override
+  public void configure(Context arg0) {
+    configuredWithContext = true;
+  }
+
+  @Override
+  public void configure(ComponentConfiguration arg0) {
+    configuredWithComponentConfiguration = true;
+  }
+}
+
+/**
+ * Internal class. Fake index name builder used only for tests.
+ */
+class FakeIndexNameBuilder implements IndexNameBuilder {
+
+  static final String INDEX_NAME = "index_name";
+
+  @Override
+  public String getIndexName(Event event) {
+    return INDEX_NAME;
+  }
+
+  @Override
+  public String getIndexPrefix(Event event) {
+    return INDEX_NAME;
+  }
+
+  @Override
+  public void configure(Context context) {
+  }
+
+  @Override
+  public void configure(ComponentConfiguration conf) {
+  }
+}
diff --git a/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/java/org/apache/flume/sink/elasticsearch/TestElasticSearchSinkCreation.java b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/java/org/apache/flume/sink/elasticsearch/TestElasticSearchSinkCreation.java
new file mode 100644
index 0000000..2a36439
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/java/org/apache/flume/sink/elasticsearch/TestElasticSearchSinkCreation.java
@@ -0,0 +1,49 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.flume.sink.elasticsearch;
+
+import org.apache.flume.FlumeException;
+import org.apache.flume.Sink;
+import org.apache.flume.SinkFactory;
+import org.apache.flume.sink.DefaultSinkFactory;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;
+
+public class TestElasticSearchSinkCreation {
+
+  private SinkFactory sinkFactory;
+
+  @Before
+  public void setUp() {
+    sinkFactory = new DefaultSinkFactory();
+  }
+
+  private void verifySinkCreation(String name, String type,
+      Class<?> typeClass) throws FlumeException {
+    Sink sink = sinkFactory.create(name, type);
+    Assert.assertNotNull(sink);
+    Assert.assertTrue(typeClass.isInstance(sink));
+  }
+
+  @Test
+  public void testSinkCreation() {
+    verifySinkCreation("elasticsearch-sink", "elasticsearch", ElasticSearchSink.class);
+  }
+}
diff --git a/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/java/org/apache/flume/sink/elasticsearch/TimeBasedIndexNameBuilderTest.java b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/java/org/apache/flume/sink/elasticsearch/TimeBasedIndexNameBuilderTest.java
new file mode 100644
index 0000000..678342a
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/java/org/apache/flume/sink/elasticsearch/TimeBasedIndexNameBuilderTest.java
@@ -0,0 +1,60 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.flume.sink.elasticsearch;
+
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.flume.event.SimpleEvent;
+import org.junit.Before;
+import org.junit.Test;
+
+import java.util.HashMap;
+import java.util.Map;
+
+import static org.junit.Assert.assertEquals;
+
+public class TimeBasedIndexNameBuilderTest {
+
+  private TimeBasedIndexNameBuilder indexNameBuilder;
+
+  @Before
+  public void setUp() throws Exception {
+    Context context = new Context();
+    context.put(ElasticSearchSinkConstants.INDEX_NAME, "prefix");
+    indexNameBuilder = new TimeBasedIndexNameBuilder();
+    indexNameBuilder.configure(context);
+  }
+
+  @Test
+  public void shouldUseUtcAsBasisForDateFormat() {
+    assertEquals("Coordinated Universal Time",
+            indexNameBuilder.getFastDateFormat().getTimeZone().getDisplayName());
+  }
+
+  @Test
+  public void indexNameShouldBePrefixDashFormattedTimestamp() {
+    long time = 987654321L;
+    Event event = new SimpleEvent();
+    Map<String, String> headers = new HashMap<String, String>();
+    headers.put("timestamp", Long.toString(time));
+    event.setHeaders(headers);
+    assertEquals("prefix-" + indexNameBuilder.getFastDateFormat().format(time),
+        indexNameBuilder.getIndexName(event));
+  }
+}
diff --git a/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/java/org/apache/flume/sink/elasticsearch/TimestampedEventTest.java b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/java/org/apache/flume/sink/elasticsearch/TimestampedEventTest.java
new file mode 100644
index 0000000..bef2ac6
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/java/org/apache/flume/sink/elasticsearch/TimestampedEventTest.java
@@ -0,0 +1,88 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.flume.sink.elasticsearch;
+
+import com.google.common.collect.Maps;
+import org.apache.flume.event.SimpleEvent;
+import org.joda.time.DateTimeUtils;
+import org.junit.Before;
+import org.junit.Test;
+
+import java.util.Map;
+
+import static org.junit.Assert.assertArrayEquals;
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertNull;
+
+public class TimestampedEventTest {
+  static final long FIXED_TIME_MILLIS = 123456789L;
+
+  @Before
+  public void setFixedJodaTime() {
+    DateTimeUtils.setCurrentMillisFixed(FIXED_TIME_MILLIS);
+  }
+
+  @Test
+  public void shouldEnsureTimestampHeaderPresentInTimestampedEvent() {
+    SimpleEvent base = new SimpleEvent();
+
+    TimestampedEvent timestampedEvent = new TimestampedEvent(base);
+    assertEquals(FIXED_TIME_MILLIS, timestampedEvent.getTimestamp());
+    assertEquals(String.valueOf(FIXED_TIME_MILLIS),
+            timestampedEvent.getHeaders().get("timestamp"));
+  }
+
+  @Test
+  public void shouldUseExistingTimestampHeaderInTimestampedEvent() {
+    SimpleEvent base = new SimpleEvent();
+    Map<String, String> headersWithTimestamp = Maps.newHashMap();
+    headersWithTimestamp.put("timestamp", "-321");
+    base.setHeaders(headersWithTimestamp);
+
+    TimestampedEvent timestampedEvent = new TimestampedEvent(base);
+    assertEquals(-321L, timestampedEvent.getTimestamp());
+    assertEquals("-321", timestampedEvent.getHeaders().get("timestamp"));
+  }
+
+  @Test
+  public void shouldUseExistingAtTimestampHeaderInTimestampedEvent() {
+    SimpleEvent base = new SimpleEvent();
+    Map<String, String> headersWithTimestamp = Maps.newHashMap();
+    headersWithTimestamp.put("@timestamp", "-999");
+    base.setHeaders(headersWithTimestamp);
+
+    TimestampedEvent timestampedEvent = new TimestampedEvent(base);
+    assertEquals(-999L, timestampedEvent.getTimestamp());
+    assertEquals("-999", timestampedEvent.getHeaders().get("@timestamp"));
+    assertNull(timestampedEvent.getHeaders().get("timestamp"));
+  }
+
+  @Test
+  public void shouldPreserveBodyAndNonTimestampHeadersInTimestampedEvent() {
+    SimpleEvent base = new SimpleEvent();
+    base.setBody(new byte[] {1,2,3,4});
+    Map<String, String> headersWithTimestamp = Maps.newHashMap();
+    headersWithTimestamp.put("foo", "bar");
+    base.setHeaders(headersWithTimestamp);
+
+    TimestampedEvent timestampedEvent = new TimestampedEvent(base);
+    assertEquals("bar", timestampedEvent.getHeaders().get("foo"));
+    assertArrayEquals(base.getBody(), timestampedEvent.getBody());
+  }
+}
diff --git a/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/java/org/apache/flume/sink/elasticsearch/client/RoundRobinListTest.java b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/java/org/apache/flume/sink/elasticsearch/client/RoundRobinListTest.java
new file mode 100644
index 0000000..0d1d092
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/java/org/apache/flume/sink/elasticsearch/client/RoundRobinListTest.java
@@ -0,0 +1,42 @@
+/*
+ * Copyright 2014 Apache Software Foundation.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flume.sink.elasticsearch.client;
+
+import java.util.Arrays;
+import org.junit.Before;
+import org.junit.Test;
+
+import static org.junit.Assert.assertEquals;
+
+public class RoundRobinListTest {
+
+  private RoundRobinList<String> fixture;
+
+  @Before
+  public void setUp() {
+    fixture = new RoundRobinList<String>(Arrays.asList("test1", "test2"));
+  }
+
+  @Test
+  public void shouldReturnNextElement() {
+    assertEquals("test1", fixture.get());
+    assertEquals("test2", fixture.get());
+    assertEquals("test1", fixture.get());
+    assertEquals("test2", fixture.get());
+    assertEquals("test1", fixture.get());
+  }
+}
diff --git a/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/java/org/apache/flume/sink/elasticsearch/client/TestElasticSearchClientFactory.java b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/java/org/apache/flume/sink/elasticsearch/client/TestElasticSearchClientFactory.java
new file mode 100644
index 0000000..c3f07b0
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/java/org/apache/flume/sink/elasticsearch/client/TestElasticSearchClientFactory.java
@@ -0,0 +1,64 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.flume.sink.elasticsearch.client;
+
+import org.apache.flume.sink.elasticsearch.ElasticSearchEventSerializer;
+import org.junit.Before;
+import org.junit.Test;
+import org.mockito.Mock;
+
+import static org.hamcrest.core.IsInstanceOf.instanceOf;
+import static org.junit.Assert.assertThat;
+import static org.mockito.MockitoAnnotations.initMocks;
+
+public class TestElasticSearchClientFactory {
+
+  ElasticSearchClientFactory factory;
+
+  @Mock
+  ElasticSearchEventSerializer serializer;
+
+  @Before
+  public void setUp() {
+    initMocks(this);
+    factory = new ElasticSearchClientFactory();
+  }
+
+  @Test
+  public void shouldReturnTransportClient() throws Exception {
+    String[] hostNames = { "127.0.0.1" };
+    Object o = factory.getClient(ElasticSearchClientFactory.TransportClient,
+                                 hostNames, "test", serializer, null);
+    assertThat(o, instanceOf(ElasticSearchTransportClient.class));
+  }
+
+  @Test
+  public void shouldReturnRestClient() throws NoSuchClientTypeException {
+    String[] hostNames = { "127.0.0.1" };
+    Object o = factory.getClient(ElasticSearchClientFactory.RestClient,
+                                 hostNames, "test", serializer, null);
+    assertThat(o, instanceOf(ElasticSearchRestClient.class));
+  }
+
+  @Test(expected = NoSuchClientTypeException.class)
+  public void shouldThrowNoSuchClientTypeException() throws NoSuchClientTypeException {
+    String[] hostNames = { "127.0.0.1" };
+    factory.getClient("not_existing_client", hostNames, "test", null, null);
+  }
+}
diff --git a/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/java/org/apache/flume/sink/elasticsearch/client/TestElasticSearchRestClient.java b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/java/org/apache/flume/sink/elasticsearch/client/TestElasticSearchRestClient.java
new file mode 100644
index 0000000..9551c81
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/java/org/apache/flume/sink/elasticsearch/client/TestElasticSearchRestClient.java
@@ -0,0 +1,180 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.flume.sink.elasticsearch.client;
+
+import com.google.common.base.Splitter;
+import com.google.gson.JsonObject;
+import com.google.gson.JsonParser;
+import org.apache.flume.Event;
+import org.apache.flume.EventDeliveryException;
+import org.apache.flume.sink.elasticsearch.ElasticSearchEventSerializer;
+import org.apache.flume.sink.elasticsearch.IndexNameBuilder;
+import org.apache.http.HttpEntity;
+import org.apache.http.HttpResponse;
+import org.apache.http.HttpStatus;
+import org.apache.http.StatusLine;
+import org.apache.http.client.HttpClient;
+import org.apache.http.client.methods.HttpPost;
+import org.apache.http.client.methods.HttpUriRequest;
+import org.apache.http.util.EntityUtils;
+import org.elasticsearch.common.bytes.BytesArray;
+import org.elasticsearch.common.bytes.BytesReference;
+import org.elasticsearch.common.io.BytesStream;
+import org.junit.Before;
+import org.junit.Test;
+import org.mockito.ArgumentCaptor;
+import org.mockito.Mock;
+
+import java.io.IOException;
+import java.util.Iterator;
+import java.util.List;
+
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertTrue;
+import static org.mockito.Mockito.any;
+import static org.mockito.Mockito.isA;
+import static org.mockito.Mockito.mock;
+import static org.mockito.Mockito.times;
+import static org.mockito.Mockito.verify;
+import static org.mockito.Mockito.when;
+import static org.mockito.MockitoAnnotations.initMocks;
+
+public class TestElasticSearchRestClient {
+
+  private ElasticSearchRestClient fixture;
+
+  @Mock
+  private ElasticSearchEventSerializer serializer;
+
+  @Mock
+  private IndexNameBuilder nameBuilder;
+
+  @Mock
+  private Event event;
+
+  @Mock
+  private HttpClient httpClient;
+
+  @Mock
+  private HttpResponse httpResponse;
+
+  @Mock
+  private StatusLine httpStatus;
+
+  @Mock
+  private HttpEntity httpEntity;
+
+  private static final String INDEX_NAME = "foo_index";
+  private static final String MESSAGE_CONTENT = "{\"body\":\"test\"}";
+  private static final String[] HOSTS = {"host1", "host2"};
+
+  @Before
+  public void setUp() throws IOException {
+    initMocks(this);
+    BytesReference bytesReference = mock(BytesReference.class);
+    BytesStream bytesStream = mock(BytesStream.class);
+
+    when(nameBuilder.getIndexName(any(Event.class))).thenReturn(INDEX_NAME);
+    when(bytesReference.toBytesArray()).thenReturn(new BytesArray(MESSAGE_CONTENT));
+    when(bytesStream.bytes()).thenReturn(bytesReference);
+    when(serializer.getContentBuilder(any(Event.class))).thenReturn(bytesStream);
+    fixture = new ElasticSearchRestClient(HOSTS, serializer, httpClient);
+  }
+
+  @Test
+  public void shouldAddNewEventWithoutTTL() throws Exception {
+    ArgumentCaptor<HttpPost> argument = ArgumentCaptor.forClass(HttpPost.class);
+
+    when(httpStatus.getStatusCode()).thenReturn(HttpStatus.SC_OK);
+    when(httpResponse.getStatusLine()).thenReturn(httpStatus);
+    when(httpClient.execute(any(HttpUriRequest.class))).thenReturn(httpResponse);
+
+    fixture.addEvent(event, nameBuilder, "bar_type", -1);
+    fixture.execute();
+
+    verify(httpClient).execute(isA(HttpUriRequest.class));
+    verify(httpClient).execute(argument.capture());
+
+    assertEquals("http://host1/_bulk", argument.getValue().getURI().toString());
+    assertTrue(verifyJsonEvents("{\"index\":{\"_type\":\"bar_type\", \"_index\":\"foo_index\"}}\n",
+            MESSAGE_CONTENT, EntityUtils.toString(argument.getValue().getEntity())));
+  }
+
+  @Test
+  public void shouldAddNewEventWithTTL() throws Exception {
+    ArgumentCaptor<HttpPost> argument = ArgumentCaptor.forClass(HttpPost.class);
+
+    when(httpStatus.getStatusCode()).thenReturn(HttpStatus.SC_OK);
+    when(httpResponse.getStatusLine()).thenReturn(httpStatus);
+    when(httpClient.execute(any(HttpUriRequest.class))).thenReturn(httpResponse);
+
+    fixture.addEvent(event, nameBuilder, "bar_type", 123);
+    fixture.execute();
+
+    verify(httpClient).execute(isA(HttpUriRequest.class));
+    verify(httpClient).execute(argument.capture());
+
+    assertEquals("http://host1/_bulk", argument.getValue().getURI().toString());
+    assertTrue(verifyJsonEvents(
+        "{\"index\":{\"_type\":\"bar_type\",\"_index\":\"foo_index\",\"_ttl\":\"123\"}}\n",
+        MESSAGE_CONTENT, EntityUtils.toString(argument.getValue().getEntity())));
+  }
+
+  private boolean verifyJsonEvents(String expectedIndex, String expectedBody, String actual) {
+    Iterator<String> it = Splitter.on("\n").split(actual).iterator();
+    JsonParser parser = new JsonParser();
+    JsonObject[] arr = new JsonObject[2];
+    for (int i = 0; i < 2; i++) {
+      arr[i] = (JsonObject) parser.parse(it.next());
+    }
+    return arr[0].equals(parser.parse(expectedIndex)) && arr[1].equals(parser.parse(expectedBody));
+  }
+
+  @Test(expected = EventDeliveryException.class)
+  public void shouldThrowEventDeliveryException() throws Exception {
+    ArgumentCaptor<HttpPost> argument = ArgumentCaptor.forClass(HttpPost.class);
+
+    when(httpStatus.getStatusCode()).thenReturn(HttpStatus.SC_INTERNAL_SERVER_ERROR);
+    when(httpResponse.getStatusLine()).thenReturn(httpStatus);
+    when(httpClient.execute(any(HttpUriRequest.class))).thenReturn(httpResponse);
+
+    fixture.addEvent(event, nameBuilder, "bar_type", 123);
+    fixture.execute();
+  }
+
+  @Test
+  public void shouldRetryBulkOperation() throws Exception {
+    ArgumentCaptor<HttpPost> argument = ArgumentCaptor.forClass(HttpPost.class);
+
+    when(httpStatus.getStatusCode()).thenReturn(HttpStatus.SC_INTERNAL_SERVER_ERROR,
+                                                HttpStatus.SC_OK);
+    when(httpResponse.getStatusLine()).thenReturn(httpStatus);
+    when(httpClient.execute(any(HttpUriRequest.class))).thenReturn(httpResponse);
+
+    fixture.addEvent(event, nameBuilder, "bar_type", 123);
+    fixture.execute();
+
+    verify(httpClient, times(2)).execute(isA(HttpUriRequest.class));
+    verify(httpClient, times(2)).execute(argument.capture());
+
+    List<HttpPost> allValues = argument.getAllValues();
+    assertEquals("http://host1/_bulk", allValues.get(0).getURI().toString());
+    assertEquals("http://host2/_bulk", allValues.get(1).getURI().toString());
+  }
+}
diff --git a/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/java/org/apache/flume/sink/elasticsearch/client/TestElasticSearchTransportClient.java b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/java/org/apache/flume/sink/elasticsearch/client/TestElasticSearchTransportClient.java
new file mode 100644
index 0000000..b7b8e74
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/java/org/apache/flume/sink/elasticsearch/client/TestElasticSearchTransportClient.java
@@ -0,0 +1,127 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.flume.sink.elasticsearch.client;
+
+import org.apache.flume.Event;
+import org.apache.flume.EventDeliveryException;
+import org.apache.flume.sink.elasticsearch.ElasticSearchEventSerializer;
+import org.apache.flume.sink.elasticsearch.IndexNameBuilder;
+import org.elasticsearch.action.ListenableActionFuture;
+import org.elasticsearch.action.bulk.BulkRequestBuilder;
+import org.elasticsearch.action.bulk.BulkResponse;
+import org.elasticsearch.action.index.IndexRequestBuilder;
+import org.elasticsearch.client.Client;
+import org.elasticsearch.common.bytes.BytesReference;
+import org.elasticsearch.common.io.BytesStream;
+import org.junit.Before;
+import org.junit.Test;
+import org.mockito.Mock;
+
+import java.io.IOException;
+
+import static org.mockito.Matchers.any;
+import static org.mockito.Matchers.anyString;
+import static org.mockito.Mockito.*;
+import static org.mockito.MockitoAnnotations.initMocks;
+
+public class TestElasticSearchTransportClient {
+
+  private ElasticSearchTransportClient fixture;
+
+  @Mock
+  private ElasticSearchEventSerializer serializer;
+
+  @Mock
+  private IndexNameBuilder nameBuilder;
+
+  @Mock
+  private Client elasticSearchClient;
+
+  @Mock
+  private BulkRequestBuilder bulkRequestBuilder;
+
+  @Mock
+  private IndexRequestBuilder indexRequestBuilder;
+
+  @Mock
+  private Event event;
+
+  @Before
+  public void setUp() throws IOException {
+    initMocks(this);
+    BytesReference bytesReference = mock(BytesReference.class);
+    BytesStream bytesStream = mock(BytesStream.class);
+
+    when(nameBuilder.getIndexName(any(Event.class))).thenReturn("foo_index");
+    when(bytesReference.toBytes()).thenReturn("{\"body\":\"test\"}".getBytes("UTF-8"));
+    when(bytesStream.bytes()).thenReturn(bytesReference);
+    when(serializer.getContentBuilder(any(Event.class)))
+        .thenReturn(bytesStream);
+    when(elasticSearchClient.prepareIndex(anyString(), anyString()))
+        .thenReturn(indexRequestBuilder);
+    when(indexRequestBuilder.setSource(bytesReference)).thenReturn(
+        indexRequestBuilder);
+
+    fixture = new ElasticSearchTransportClient(elasticSearchClient, serializer);
+    fixture.setBulkRequestBuilder(bulkRequestBuilder);
+  }
+
+  @Test
+  public void shouldAddNewEventWithoutTTL() throws Exception {
+    fixture.addEvent(event, nameBuilder, "bar_type", -1);
+    verify(indexRequestBuilder).setSource(
+        serializer.getContentBuilder(event).bytes());
+    verify(bulkRequestBuilder).add(indexRequestBuilder);
+  }
+
+  @Test
+  public void shouldAddNewEventWithTTL() throws Exception {
+    fixture.addEvent(event, nameBuilder, "bar_type", 10);
+    verify(indexRequestBuilder).setTTL(10);
+    verify(indexRequestBuilder).setSource(
+        serializer.getContentBuilder(event).bytes());
+  }
+
+  @Test
+  public void shouldExecuteBulkRequestBuilder() throws Exception {
+    ListenableActionFuture<BulkResponse> action =
+        (ListenableActionFuture<BulkResponse>) mock(ListenableActionFuture.class);
+    BulkResponse response = mock(BulkResponse.class);
+    when(bulkRequestBuilder.execute()).thenReturn(action);
+    when(action.actionGet()).thenReturn(response);
+    when(response.hasFailures()).thenReturn(false);
+
+    fixture.addEvent(event, nameBuilder, "bar_type", 10);
+    fixture.execute();
+    verify(bulkRequestBuilder).execute();
+  }
+
+  @Test(expected = EventDeliveryException.class)
+  public void shouldThrowExceptionOnExecuteFailed() throws Exception {
+    ListenableActionFuture<BulkResponse> action =
+        (ListenableActionFuture<BulkResponse>) mock(ListenableActionFuture.class);
+    BulkResponse response = mock(BulkResponse.class);
+    when(bulkRequestBuilder.execute()).thenReturn(action);
+    when(action.actionGet()).thenReturn(response);
+    when(response.hasFailures()).thenReturn(true);
+
+    fixture.addEvent(event, nameBuilder, "bar_type", 10);
+    fixture.execute();
+  }
+}
diff --git a/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/resources/log4j.properties b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/resources/log4j.properties
new file mode 100644
index 0000000..9036aca
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/resources/log4j.properties
@@ -0,0 +1,25 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#  http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+
+log4j.rootLogger = DEBUG, out
+
+log4j.appender.out = org.apache.log4j.ConsoleAppender
+log4j.appender.out.layout = org.apache.log4j.PatternLayout
+log4j.appender.out.layout.ConversionPattern = %d (%t) [%p - %l] %m%n
+
+log4j.logger.org.apache.flume = DEBUG
diff --git a/code/flume-ng-sinks/flume-ng-hbase-sink/.idea/artifacts/flume_ng_hbase_sink_jar.xml b/code/flume-ng-sinks/flume-ng-hbase-sink/.idea/artifacts/flume_ng_hbase_sink_jar.xml
new file mode 100644
index 0000000..f3e9b44
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-hbase-sink/.idea/artifacts/flume_ng_hbase_sink_jar.xml
@@ -0,0 +1,8 @@
+<component name="ArtifactManager">
+  <artifact type="jar" name="flume-ng-hbase-sink:jar">
+    <output-path>$PROJECT_DIR$/out/artifacts/flume_ng_hbase_sink_jar</output-path>
+    <root id="archive" name="flume-ng-hbase-sink.jar">
+      <element id="module-output" name="flume-ng-hbase-sink" />
+    </root>
+  </artifact>
+</component>
\ No newline at end of file
diff --git a/code/flume-ng-sinks/flume-ng-hbase-sink/.idea/compiler.xml b/code/flume-ng-sinks/flume-ng-hbase-sink/.idea/compiler.xml
new file mode 100644
index 0000000..6e72b1f
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-hbase-sink/.idea/compiler.xml
@@ -0,0 +1,13 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<project version="4">
+  <component name="CompilerConfiguration">
+    <annotationProcessing>
+      <profile name="Maven default annotation processors profile" enabled="true">
+        <sourceOutputDir name="target/generated-sources/annotations" />
+        <sourceTestOutputDir name="target/generated-test-sources/test-annotations" />
+        <outputRelativeToContentRoot value="true" />
+        <module name="flume-ng-hbase-sink" />
+      </profile>
+    </annotationProcessing>
+  </component>
+</project>
\ No newline at end of file
diff --git a/code/flume-ng-sinks/flume-ng-hbase-sink/.idea/encodings.xml b/code/flume-ng-sinks/flume-ng-hbase-sink/.idea/encodings.xml
new file mode 100644
index 0000000..b26911b
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-hbase-sink/.idea/encodings.xml
@@ -0,0 +1,6 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<project version="4">
+  <component name="Encoding">
+    <file url="file://$PROJECT_DIR$" charset="UTF-8" />
+  </component>
+</project>
\ No newline at end of file
diff --git a/code/flume-ng-sinks/flume-ng-hbase-sink/.idea/misc.xml b/code/flume-ng-sinks/flume-ng-hbase-sink/.idea/misc.xml
new file mode 100644
index 0000000..4b661a5
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-hbase-sink/.idea/misc.xml
@@ -0,0 +1,14 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<project version="4">
+  <component name="ExternalStorageConfigurationManager" enabled="true" />
+  <component name="MavenProjectsManager">
+    <option name="originalFiles">
+      <list>
+        <option value="$PROJECT_DIR$/pom.xml" />
+      </list>
+    </option>
+  </component>
+  <component name="ProjectRootManager" version="2" languageLevel="JDK_1_8" project-jdk-name="1.8" project-jdk-type="JavaSDK">
+    <output url="file://$PROJECT_DIR$/out" />
+  </component>
+</project>
\ No newline at end of file
diff --git a/code/flume-ng-sinks/flume-ng-hbase-sink/.idea/workspace.xml b/code/flume-ng-sinks/flume-ng-hbase-sink/.idea/workspace.xml
new file mode 100644
index 0000000..dd63465
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-hbase-sink/.idea/workspace.xml
@@ -0,0 +1,435 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<project version="4">
+  <component name="ArtifactsWorkspaceSettings">
+    <artifacts-to-build>
+      <artifact name="flume-ng-hbase-sink:jar" />
+    </artifacts-to-build>
+  </component>
+  <component name="ChangeListManager">
+    <list default="true" id="a160f075-6493-43c8-89fc-4fc7c29d7b9a" name="Default Changelist" comment="" />
+    <ignored path="$PROJECT_DIR$/target/" />
+    <option name="EXCLUDED_CONVERTED_TO_IGNORED" value="true" />
+    <option name="SHOW_DIALOG" value="false" />
+    <option name="HIGHLIGHT_CONFLICTS" value="true" />
+    <option name="HIGHLIGHT_NON_ACTIVE_CHANGELIST" value="false" />
+    <option name="LAST_RESOLUTION" value="IGNORE" />
+  </component>
+  <component name="FUSProjectUsageTrigger">
+    <session id="-621900688">
+      <usages-collector id="statistics.lifecycle.project">
+        <counts>
+          <entry key="project.closed" value="7" />
+          <entry key="project.open.time.18" value="1" />
+          <entry key="project.open.time.24" value="1" />
+          <entry key="project.open.time.28" value="1" />
+          <entry key="project.open.time.3" value="1" />
+          <entry key="project.open.time.30" value="1" />
+          <entry key="project.open.time.5" value="1" />
+          <entry key="project.open.time.7" value="1" />
+          <entry key="project.opened" value="7" />
+        </counts>
+      </usages-collector>
+      <usages-collector id="statistics.file.extensions.open">
+        <counts>
+          <entry key="class" value="1" />
+          <entry key="java" value="5" />
+          <entry key="xml" value="1" />
+        </counts>
+      </usages-collector>
+      <usages-collector id="statistics.file.types.open">
+        <counts>
+          <entry key="CLASS" value="1" />
+          <entry key="JAVA" value="5" />
+          <entry key="XML" value="1" />
+        </counts>
+      </usages-collector>
+      <usages-collector id="statistics.file.extensions.edit">
+        <counts>
+          <entry key="java" value="6" />
+        </counts>
+      </usages-collector>
+      <usages-collector id="statistics.file.types.edit">
+        <counts>
+          <entry key="JAVA" value="6" />
+        </counts>
+      </usages-collector>
+    </session>
+  </component>
+  <component name="FileEditorManager">
+    <leaf SIDE_TABS_SIZE_LIMIT_KEY="300">
+      <file pinned="false" current-in-tab="false">
+        <entry file="file://$PROJECT_DIR$/src/main/java/org/apache/flume/sink/hbase/SimpleAsyncHbaseEventSerializer.java">
+          <provider selected="true" editor-type-id="text-editor">
+            <state relative-caret-position="-540">
+              <caret line="51" column="66" selection-start-line="51" selection-start-column="66" selection-end-line="51" selection-end-column="66" />
+            </state>
+          </provider>
+        </entry>
+      </file>
+      <file pinned="false" current-in-tab="false">
+        <entry file="file://$PROJECT_DIR$/src/main/java/org/apache/flume/sink/hbase/KfkAsyncHbaseEventSerializer.java">
+          <provider selected="true" editor-type-id="text-editor">
+            <state relative-caret-position="552">
+              <caret line="90" column="59" selection-start-line="90" selection-start-column="59" selection-end-line="90" selection-end-column="59" />
+              <folding>
+                <element signature="e#5615#5616#0" expanded="true" />
+                <element signature="e#5661#5662#0" expanded="true" />
+              </folding>
+            </state>
+          </provider>
+        </entry>
+      </file>
+      <file pinned="false" current-in-tab="true">
+        <entry file="file://$PROJECT_DIR$/src/main/java/org/apache/flume/sink/hbase/SimpleRowKeyGenerator.java">
+          <provider selected="true" editor-type-id="text-editor">
+            <state relative-caret-position="100">
+              <caret line="24" column="3" lean-forward="true" selection-start-line="24" selection-start-column="3" selection-end-line="24" selection-end-column="3" />
+            </state>
+          </provider>
+        </entry>
+      </file>
+      <file pinned="false" current-in-tab="false">
+        <entry file="file://$PROJECT_DIR$/src/main/java/org/apache/flume/sink/hbase/SimpleHbaseEventSerializer.java">
+          <provider selected="true" editor-type-id="text-editor">
+            <state relative-caret-position="2140">
+              <caret line="134" column="11" selection-start-line="134" selection-start-column="11" selection-end-line="134" selection-end-column="11" />
+            </state>
+          </provider>
+        </entry>
+      </file>
+      <file pinned="false" current-in-tab="false">
+        <entry file="jar://$MAVEN_REPOSITORY$/org/apache/flume/flume-ng-configuration/1.7.0/flume-ng-configuration-1.7.0.jar!/org/apache/flume/Context.class">
+          <provider selected="true" editor-type-id="text-editor">
+            <state relative-caret-position="1540">
+              <caret line="106" column="54" selection-start-line="105" selection-start-column="16" selection-end-line="106" selection-end-column="54" />
+            </state>
+          </provider>
+        </entry>
+      </file>
+      <file pinned="false" current-in-tab="false">
+        <entry file="file://$PROJECT_DIR$/src/main/java/org/apache/flume/sink/hbase/AsyncHbaseEventSerializer.java">
+          <provider selected="true" editor-type-id="text-editor">
+            <state relative-caret-position="200">
+              <caret line="33" column="17" selection-start-line="33" selection-start-column="17" selection-end-line="33" selection-end-column="17" />
+            </state>
+          </provider>
+        </entry>
+      </file>
+    </leaf>
+  </component>
+  <component name="FileTemplateManagerImpl">
+    <option name="RECENT_TEMPLATES">
+      <list>
+        <option value="Class" />
+      </list>
+    </option>
+  </component>
+  <component name="IdeDocumentHistory">
+    <option name="CHANGED_PATHS">
+      <list>
+        <option value="$PROJECT_DIR$/src/main/java/org/apache/flume/sink/hbase/KfkAsyncHbaseEventSerializer.java" />
+        <option value="$PROJECT_DIR$/src/main/java/org/apache/flume/sink/hbase/SimpleRowKeyGenerator.java" />
+      </list>
+    </option>
+  </component>
+  <component name="JsBuildToolGruntFileManager" detection-done="true" sorting="DEFINITION_ORDER" />
+  <component name="JsBuildToolPackageJson" detection-done="true" sorting="DEFINITION_ORDER" />
+  <component name="JsGulpfileManager">
+    <detection-done>true</detection-done>
+    <sorting>DEFINITION_ORDER</sorting>
+  </component>
+  <component name="ProjectFrameBounds" extendedState="6">
+    <option name="x" value="493" />
+    <option name="y" value="121" />
+    <option name="width" value="1382" />
+    <option name="height" value="744" />
+  </component>
+  <component name="ProjectView">
+    <navigator proportions="" version="1">
+      <foldersAlwaysOnTop value="true" />
+    </navigator>
+    <panes>
+      <pane id="ProjectPane">
+        <subPane>
+          <expand>
+            <path>
+              <item name="flume-ng-hbase-sink" type="b2602c69:ProjectViewProjectNode" />
+              <item name="flume-ng-hbase-sink" type="462c0819:PsiDirectoryNode" />
+            </path>
+            <path>
+              <item name="flume-ng-hbase-sink" type="b2602c69:ProjectViewProjectNode" />
+              <item name="flume-ng-hbase-sink" type="462c0819:PsiDirectoryNode" />
+              <item name="out" type="462c0819:PsiDirectoryNode" />
+            </path>
+            <path>
+              <item name="flume-ng-hbase-sink" type="b2602c69:ProjectViewProjectNode" />
+              <item name="flume-ng-hbase-sink" type="462c0819:PsiDirectoryNode" />
+              <item name="out" type="462c0819:PsiDirectoryNode" />
+              <item name="artifacts" type="462c0819:PsiDirectoryNode" />
+            </path>
+            <path>
+              <item name="flume-ng-hbase-sink" type="b2602c69:ProjectViewProjectNode" />
+              <item name="flume-ng-hbase-sink" type="462c0819:PsiDirectoryNode" />
+              <item name="src" type="462c0819:PsiDirectoryNode" />
+            </path>
+            <path>
+              <item name="flume-ng-hbase-sink" type="b2602c69:ProjectViewProjectNode" />
+              <item name="flume-ng-hbase-sink" type="462c0819:PsiDirectoryNode" />
+              <item name="src" type="462c0819:PsiDirectoryNode" />
+              <item name="main" type="462c0819:PsiDirectoryNode" />
+            </path>
+            <path>
+              <item name="flume-ng-hbase-sink" type="b2602c69:ProjectViewProjectNode" />
+              <item name="flume-ng-hbase-sink" type="462c0819:PsiDirectoryNode" />
+              <item name="src" type="462c0819:PsiDirectoryNode" />
+              <item name="main" type="462c0819:PsiDirectoryNode" />
+              <item name="java" type="462c0819:PsiDirectoryNode" />
+            </path>
+            <path>
+              <item name="flume-ng-hbase-sink" type="b2602c69:ProjectViewProjectNode" />
+              <item name="flume-ng-hbase-sink" type="462c0819:PsiDirectoryNode" />
+              <item name="src" type="462c0819:PsiDirectoryNode" />
+              <item name="main" type="462c0819:PsiDirectoryNode" />
+              <item name="java" type="462c0819:PsiDirectoryNode" />
+              <item name="hbase" type="462c0819:PsiDirectoryNode" />
+            </path>
+          </expand>
+          <select />
+        </subPane>
+      </pane>
+      <pane id="PackagesPane" />
+      <pane id="Scope" />
+    </panes>
+  </component>
+  <component name="PropertiesComponent">
+    <property name="WebServerToolWindowFactoryState" value="false" />
+    <property name="aspect.path.notification.shown" value="true" />
+    <property name="com.android.tools.idea.instantapp.provision.ProvisionBeforeRunTaskProvider.myTimeStamp" value="1546522572477" />
+    <property name="nodejs_interpreter_path.stuck_in_default_project" value="undefined stuck path" />
+    <property name="nodejs_npm_path_reset_for_default_project" value="true" />
+    <property name="project.structure.last.edited" value="Artifacts" />
+    <property name="project.structure.proportion" value="0.15" />
+    <property name="project.structure.side.proportion" value="0.2" />
+    <property name="settings.editor.selected.configurable" value="preferences.lookFeel" />
+  </component>
+  <component name="RunDashboard">
+    <option name="ruleStates">
+      <list>
+        <RuleState>
+          <option name="name" value="ConfigurationTypeDashboardGroupingRule" />
+        </RuleState>
+        <RuleState>
+          <option name="name" value="StatusDashboardGroupingRule" />
+        </RuleState>
+      </list>
+    </option>
+  </component>
+  <component name="SvnConfiguration">
+    <configuration />
+  </component>
+  <component name="TaskManager">
+    <task active="true" id="Default" summary="Default task">
+      <changelist id="a160f075-6493-43c8-89fc-4fc7c29d7b9a" name="Default Changelist" comment="" />
+      <created>1546438606072</created>
+      <option name="number" value="Default" />
+      <option name="presentableId" value="Default" />
+      <updated>1546438606072</updated>
+      <workItem from="1546438609890" duration="1167000" />
+      <workItem from="1546481683883" duration="12000" />
+      <workItem from="1546520764784" duration="1795000" />
+      <workItem from="1546861613026" duration="2378000" />
+      <workItem from="1546935459895" duration="28000" />
+      <workItem from="1547953965563" duration="767000" />
+      <workItem from="1547966336234" duration="1984000" />
+    </task>
+    <servers />
+  </component>
+  <component name="TimeTrackingManager">
+    <option name="totallyTimeSpent" value="8131000" />
+  </component>
+  <component name="ToolWindowManager">
+    <frame x="-8" y="-8" width="1936" height="1056" extended-state="6" />
+    <editor active="true" />
+    <layout>
+      <window_info active="true" content_ui="combo" id="Project" order="0" visible="true" weight="0.2675906" />
+      <window_info id="Structure" order="1" side_tool="true" weight="0.25" />
+      <window_info id="Image Layers" order="2" />
+      <window_info id="Designer" order="3" />
+      <window_info id="UI Designer" order="4" />
+      <window_info id="Capture Tool" order="5" />
+      <window_info id="Favorites" order="6" side_tool="true" />
+      <window_info anchor="bottom" id="Message" order="0" />
+      <window_info anchor="bottom" id="Find" order="1" />
+      <window_info anchor="bottom" id="Run" order="2" />
+      <window_info anchor="bottom" id="Debug" order="3" weight="0.4" />
+      <window_info anchor="bottom" id="Cvs" order="4" weight="0.25" />
+      <window_info anchor="bottom" id="Inspection" order="5" weight="0.4" />
+      <window_info anchor="bottom" id="TODO" order="6" />
+      <window_info anchor="bottom" id="Version Control" order="7" show_stripe_button="false" />
+      <window_info anchor="bottom" id="Database Changes" order="8" show_stripe_button="false" />
+      <window_info anchor="bottom" id="Terminal" order="9" />
+      <window_info anchor="bottom" id="Event Log" order="10" side_tool="true" weight="0.329718" />
+      <window_info anchor="bottom" id="Java Enterprise" order="11" />
+      <window_info active="true" anchor="bottom" id="Messages" order="12" visible="true" weight="0.329718" />
+      <window_info anchor="right" id="Commander" internal_type="SLIDING" order="0" type="SLIDING" weight="0.4" />
+      <window_info anchor="right" id="Ant Build" order="1" weight="0.25" />
+      <window_info anchor="right" content_ui="combo" id="Hierarchy" order="2" weight="0.25" />
+      <window_info anchor="right" id="Palette" order="3" />
+      <window_info anchor="right" id="Capture Analysis" order="4" />
+      <window_info anchor="right" id="Database" order="5" />
+      <window_info anchor="right" id="Theme Preview" order="6" />
+      <window_info anchor="right" id="Palette&#9;" order="7" />
+      <window_info anchor="right" id="Maven Projects" order="8" />
+    </layout>
+  </component>
+  <component name="TypeScriptGeneratedFilesManager">
+    <option name="version" value="1" />
+  </component>
+  <component name="VcsContentAnnotationSettings">
+    <option name="myLimit" value="2678400000" />
+  </component>
+  <component name="editorHistoryManager">
+    <entry file="file://$PROJECT_DIR$/pom.xml">
+      <provider selected="true" editor-type-id="text-editor">
+        <state relative-caret-position="-1640">
+          <caret line="17" column="20" selection-start-line="17" selection-start-column="20" selection-end-line="17" selection-end-column="20" />
+        </state>
+      </provider>
+    </entry>
+    <entry file="file://$PROJECT_DIR$/src/main/java/org/apache/flume/sink/hbase/SimpleAsyncHbaseEventSerializer.java">
+      <provider selected="true" editor-type-id="text-editor">
+        <state relative-caret-position="-540">
+          <caret line="51" column="66" selection-start-line="51" selection-start-column="66" selection-end-line="51" selection-end-column="66" />
+        </state>
+      </provider>
+    </entry>
+    <entry file="file://$PROJECT_DIR$/src/main/java/org/apache/flume/sink/hbase/SimpleHbaseEventSerializer.java">
+      <provider selected="true" editor-type-id="text-editor">
+        <state relative-caret-position="2140">
+          <caret line="134" column="11" selection-start-line="134" selection-start-column="11" selection-end-line="134" selection-end-column="11" />
+        </state>
+      </provider>
+    </entry>
+    <entry file="jar://$MAVEN_REPOSITORY$/org/apache/flume/flume-ng-configuration/1.7.0/flume-ng-configuration-1.7.0.jar!/org/apache/flume/Context.class">
+      <provider selected="true" editor-type-id="text-editor">
+        <state relative-caret-position="1540">
+          <caret line="106" column="54" selection-start-line="105" selection-start-column="16" selection-end-line="106" selection-end-column="54" />
+        </state>
+      </provider>
+    </entry>
+    <entry file="file://$PROJECT_DIR$/src/main/java/org/apache/flume/sink/hbase/AsyncHbaseEventSerializer.java">
+      <provider selected="true" editor-type-id="text-editor">
+        <state relative-caret-position="200">
+          <caret line="33" column="17" selection-start-line="33" selection-start-column="17" selection-end-line="33" selection-end-column="17" />
+        </state>
+      </provider>
+    </entry>
+    <entry file="file://$PROJECT_DIR$/src/main/java/org/apache/flume/sink/hbase/KfkAsyncHbaseEventSerializer.java">
+      <provider selected="true" editor-type-id="text-editor">
+        <state relative-caret-position="552">
+          <caret line="90" column="59" selection-start-line="90" selection-start-column="59" selection-end-line="90" selection-end-column="59" />
+          <folding>
+            <element signature="e#5615#5616#0" expanded="true" />
+            <element signature="e#5661#5662#0" expanded="true" />
+          </folding>
+        </state>
+      </provider>
+    </entry>
+    <entry file="file://$PROJECT_DIR$/src/main/java/org/apache/flume/sink/hbase/SimpleRowKeyGenerator.java">
+      <provider selected="true" editor-type-id="text-editor">
+        <state relative-caret-position="100">
+          <caret line="24" column="3" lean-forward="true" selection-start-line="24" selection-start-column="3" selection-end-line="24" selection-end-column="3" />
+        </state>
+      </provider>
+    </entry>
+  </component>
+  <component name="masterDetails">
+    <states>
+      <state key="ArtifactsStructureConfigurable.UI">
+        <settings>
+          <artifact-editor />
+          <last-edited>flume-ng-hbase-sink:jar</last-edited>
+          <splitter-proportions>
+            <option name="proportions">
+              <list>
+                <option value="0.2" />
+                <option value="0.5" />
+              </list>
+            </option>
+          </splitter-proportions>
+        </settings>
+      </state>
+      <state key="FacetStructureConfigurable.UI">
+        <settings>
+          <last-edited>No facets are configured</last-edited>
+          <splitter-proportions>
+            <option name="proportions">
+              <list>
+                <option value="0.2" />
+              </list>
+            </option>
+          </splitter-proportions>
+        </settings>
+      </state>
+      <state key="GlobalLibrariesConfigurable.UI">
+        <settings>
+          <splitter-proportions>
+            <option name="proportions">
+              <list>
+                <option value="0.2" />
+              </list>
+            </option>
+          </splitter-proportions>
+        </settings>
+      </state>
+      <state key="JdkListConfigurable.UI">
+        <settings>
+          <last-edited>1.8</last-edited>
+          <splitter-proportions>
+            <option name="proportions">
+              <list>
+                <option value="0.2" />
+              </list>
+            </option>
+          </splitter-proportions>
+        </settings>
+      </state>
+      <state key="ModuleStructureConfigurable.UI">
+        <settings>
+          <last-edited>flume-ng-hbase-sink</last-edited>
+          <splitter-proportions>
+            <option name="proportions">
+              <list>
+                <option value="0.2" />
+              </list>
+            </option>
+          </splitter-proportions>
+        </settings>
+      </state>
+      <state key="ProjectJDKs.UI">
+        <settings>
+          <last-edited>1.8</last-edited>
+          <splitter-proportions>
+            <option name="proportions">
+              <list>
+                <option value="0.2" />
+              </list>
+            </option>
+          </splitter-proportions>
+        </settings>
+      </state>
+      <state key="ProjectLibrariesConfigurable.UI">
+        <settings>
+          <last-edited>Maven: aopalliance:aopalliance:1.0</last-edited>
+          <splitter-proportions>
+            <option name="proportions">
+              <list>
+                <option value="0.2" />
+              </list>
+            </option>
+          </splitter-proportions>
+        </settings>
+      </state>
+    </states>
+  </component>
+</project>
\ No newline at end of file
diff --git a/code/flume-ng-sinks/flume-ng-hbase-sink/out/artifacts/flume_ng_hbase_sink_jar/flume-ng-hbase-sink.jar b/code/flume-ng-sinks/flume-ng-hbase-sink/out/artifacts/flume_ng_hbase_sink_jar/flume-ng-hbase-sink.jar
new file mode 100644
index 0000000..e1bcfa7
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-hbase-sink/out/artifacts/flume_ng_hbase_sink_jar/flume-ng-hbase-sink.jar differ
diff --git a/code/flume-ng-sinks/flume-ng-hbase-sink/pom.xml b/code/flume-ng-sinks/flume-ng-hbase-sink/pom.xml
new file mode 100644
index 0000000..66ffa4d
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-hbase-sink/pom.xml
@@ -0,0 +1,255 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!-- Licensed to the Apache Software Foundation (ASF) under one or more contributor
+  license agreements. See the NOTICE file distributed with this work for additional
+  information regarding copyright ownership. The ASF licenses this file to
+  You under the Apache License, Version 2.0 (the "License"); you may not use
+  this file except in compliance with the License. You may obtain a copy of
+  the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required
+  by applicable law or agreed to in writing, software distributed under the
+  License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS
+  OF ANY KIND, either express or implied. See the License for the specific
+  language governing permissions and limitations under the License. -->
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+  <modelVersion>4.0.0</modelVersion>
+  <parent>
+    <artifactId>flume-ng-sinks</artifactId>
+    <groupId>org.apache.flume</groupId>
+    <version>1.7.0</version>
+  </parent>
+  <groupId>org.apache.flume.flume-ng-sinks</groupId>
+  <artifactId>flume-ng-hbase-sink</artifactId>
+  <name>Flume NG HBase Sink</name>
+
+  <build>
+    <plugins>
+      <plugin>
+        <groupId>org.apache.rat</groupId>
+        <artifactId>apache-rat-plugin</artifactId>
+      </plugin>
+    </plugins>
+  </build>
+  <dependencies>
+
+    <dependency>
+      <groupId>org.apache.flume</groupId>
+      <artifactId>flume-ng-sdk</artifactId>
+    </dependency>
+
+    <dependency>
+      <groupId>org.apache.flume</groupId>
+      <artifactId>flume-ng-core</artifactId>
+    </dependency>
+
+    <dependency>
+      <groupId>org.apache.flume</groupId>
+      <artifactId>flume-ng-configuration</artifactId>
+    </dependency>
+
+    <dependency>
+      <groupId>org.slf4j</groupId>
+      <artifactId>slf4j-api</artifactId>
+    </dependency>
+
+    <dependency>
+      <groupId>com.google.guava</groupId>
+      <artifactId>guava</artifactId>
+    </dependency>
+
+
+    <dependency>
+      <groupId>org.hbase</groupId>
+      <artifactId>asynchbase</artifactId>
+    </dependency>
+
+    <dependency>
+      <groupId>io.netty</groupId>
+      <artifactId>netty</artifactId>
+    </dependency>
+
+    <dependency>
+      <groupId>org.slf4j</groupId>
+      <artifactId>slf4j-log4j12</artifactId>
+      <scope>test</scope>
+    </dependency>
+
+    <dependency>
+      <groupId>junit</groupId>
+      <artifactId>junit</artifactId>
+      <scope>test</scope>
+    </dependency>
+
+    <dependency>
+      <groupId>org.apache.hadoop</groupId>
+      <artifactId>${hadoop.common.artifact.id}</artifactId>
+      <optional>true</optional>
+    </dependency>
+
+    <dependency>
+      <groupId>commons-io</groupId>
+      <artifactId>commons-io</artifactId>
+      <scope>test</scope>
+    </dependency>
+
+    <dependency>
+      <groupId>commons-lang</groupId>
+      <artifactId>commons-lang</artifactId>
+    </dependency>
+
+    <dependency>
+      <groupId>org.mockito</groupId>
+      <artifactId>mockito-all</artifactId>
+      <scope>test</scope>
+    </dependency>
+
+    <dependency>
+      <groupId>org.apache.flume.flume-ng-sinks</groupId>
+      <artifactId>flume-hdfs-sink</artifactId>
+    </dependency>
+
+  </dependencies>
+
+  <profiles>
+    <profile>
+      <id>hadoop-1.0</id>
+      <activation>
+        <property>
+          <name>flume.hadoop.profile</name>
+          <value>1</value>
+        </property>
+      </activation>
+      <dependencies>
+        <dependency>
+          <groupId>org.apache.hadoop</groupId>
+          <artifactId>hadoop-test</artifactId>
+          <scope>test</scope>
+        </dependency>
+        <!-- required because the hadoop-core pom is missing these deps
+            and MiniDFSCluster pulls in the webhdfs classes -->
+        <dependency>
+          <groupId>com.sun.jersey</groupId>
+          <artifactId>jersey-core</artifactId>
+          <scope>test</scope>
+        </dependency>
+        <dependency>
+          <groupId>org.apache.hbase</groupId>
+          <artifactId>hbase</artifactId>
+          <optional>true</optional>
+        </dependency>
+
+        <dependency>
+          <groupId>org.apache.hbase</groupId>
+          <artifactId>hbase</artifactId>
+          <classifier>tests</classifier>
+          <scope>test</scope>
+        </dependency>
+
+        <dependency>
+          <groupId>org.apache.zookeeper</groupId>
+          <artifactId>zookeeper</artifactId>
+          <scope>test</scope>
+        </dependency>
+      </dependencies>
+    </profile>
+    <profile>
+      <id>hadoop-2</id>
+      <activation>
+        <property>
+          <name>flume.hadoop.profile</name>
+          <value>2</value>
+        </property>
+      </activation>
+      <dependencies>
+        <dependency>
+          <groupId>org.apache.hadoop</groupId>
+          <artifactId>hadoop-minicluster</artifactId>
+          <scope>test</scope>
+        </dependency>
+        <dependency>
+          <groupId>org.apache.hbase</groupId>
+          <artifactId>hbase</artifactId>
+          <optional>true</optional>
+        </dependency>
+
+        <dependency>
+          <groupId>org.apache.hbase</groupId>
+          <artifactId>hbase</artifactId>
+          <classifier>tests</classifier>
+          <scope>test</scope>
+        </dependency>
+
+        <dependency>
+          <groupId>org.apache.zookeeper</groupId>
+          <artifactId>zookeeper</artifactId>
+          <scope>test</scope>
+        </dependency>
+      </dependencies>
+    </profile>
+    <profile>
+      <id>hbase-1</id>
+      <activation>
+        <property>
+          <name>!flume.hadoop.profile</name>
+        </property>
+      </activation>
+      <dependencies>
+        <dependency>
+          <groupId>org.apache.hadoop</groupId>
+          <artifactId>hadoop-minicluster</artifactId>
+          <scope>test</scope>
+        </dependency>
+
+        <dependency>
+          <groupId>org.apache.hbase</groupId>
+          <artifactId>hbase-client</artifactId>
+          <optional>true</optional>
+        </dependency>
+
+        <dependency>
+          <groupId>org.apache.hbase</groupId>
+          <artifactId>hbase-client</artifactId>
+          <classifier>tests</classifier>
+          <scope>test</scope>
+        </dependency>
+
+
+        <dependency>
+          <groupId>org.apache.hbase</groupId>
+          <artifactId>hbase-server</artifactId>
+          <scope>test</scope>
+        </dependency>
+
+        <dependency>
+          <groupId>org.apache.hbase</groupId>
+          <artifactId>hbase-server</artifactId>
+          <classifier>tests</classifier>
+          <scope>test</scope>
+        </dependency>
+
+        <!-- There should be no need for Flume to include the following two
+         artifacts, but the HBase pom has a bug which causes these to not get
+         pulled in. So we have to pull them in. Ideally these should be optional,
+         but making them optional causes the build to fail.
+        -->
+
+        <dependency>
+          <groupId>org.apache.hbase</groupId>
+          <artifactId>hbase-common</artifactId>
+          <optional>true</optional>
+        </dependency>
+        <dependency>
+          <groupId>org.apache.hbase</groupId>
+          <artifactId>hbase-testing-util</artifactId>
+          <scope>test</scope>
+        </dependency>
+
+        <dependency>
+          <groupId>org.apache.zookeeper</groupId>
+          <artifactId>zookeeper</artifactId>
+          <scope>test</scope>
+        </dependency>
+      </dependencies>
+    </profile>
+  </profiles>
+
+</project>
diff --git a/code/flume-ng-sinks/flume-ng-hbase-sink/src/main/java/org/apache/flume/sink/hbase/AsyncHBaseSink.java b/code/flume-ng-sinks/flume-ng-hbase-sink/src/main/java/org/apache/flume/sink/hbase/AsyncHBaseSink.java
new file mode 100644
index 0000000..f120f59
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-hbase-sink/src/main/java/org/apache/flume/sink/hbase/AsyncHBaseSink.java
@@ -0,0 +1,708 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.flume.sink.hbase;
+
+import com.google.common.annotations.VisibleForTesting;
+import com.google.common.base.Charsets;
+import com.google.common.base.Preconditions;
+import com.google.common.base.Throwables;
+import com.google.common.collect.Maps;
+import com.google.common.primitives.UnsignedBytes;
+import com.google.common.util.concurrent.ThreadFactoryBuilder;
+import com.stumbleupon.async.Callback;
+import org.apache.flume.Channel;
+import org.apache.flume.ChannelException;
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.flume.EventDeliveryException;
+import org.apache.flume.FlumeException;
+import org.apache.flume.Transaction;
+import org.apache.flume.conf.Configurable;
+import org.apache.flume.instrumentation.SinkCounter;
+import org.apache.flume.sink.AbstractSink;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.hbase.HBaseConfiguration;
+import org.apache.hadoop.hbase.HConstants;
+import org.apache.hadoop.hbase.zookeeper.ZKConfig;
+import org.hbase.async.AtomicIncrementRequest;
+import org.hbase.async.HBaseClient;
+import org.hbase.async.PutRequest;
+import org.jboss.netty.channel.socket.nio.NioClientSocketChannelFactory;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.Arrays;
+import java.util.Collection;
+import java.util.Comparator;
+import java.util.List;
+import java.util.Map;
+import java.util.concurrent.CountDownLatch;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.atomic.AtomicBoolean;
+import java.util.concurrent.atomic.AtomicInteger;
+import java.util.concurrent.locks.Condition;
+import java.util.concurrent.locks.Lock;
+import java.util.concurrent.locks.ReentrantLock;
+
+/**
+ * A simple sink which reads events from a channel and writes them to HBase.
+ * This sink uses the asynchronous asynchbase client internally and is likely
+ * to perform better than a sink built on a blocking HBase client.
+ * The HBase configuration is picked up from the first <tt>hbase-site.xml</tt>
+ * encountered in the classpath. This sink supports batch reading of
+ * events from the channel, and writing them to HBase, to minimize the number
+ * of flushes on the HBase tables. To use this sink, it has to be configured
+ * with certain mandatory parameters:<p>
+ * <p>
+ * <tt>table: </tt> The name of the table in HBase to write to. <p>
+ * <tt>columnFamily: </tt> The column family in HBase to write to.<p>
+ * Other optional parameters are:<p>
+ * <tt>serializer:</tt> A class implementing
+ * {@link AsyncHbaseEventSerializer}.
+ * An instance of
+ * this class will be used to serialize events which are written to hbase.<p>
+ * <tt>serializer.*:</tt> Passed in the <code>configure()</code> method to
+ * serializer
+ * as an object of {@link org.apache.flume.Context}.<p>
+ * <tt>batchSize: </tt>The maximum number of events the sink will take from
+ * the channel and commit per transaction. The default batch size is
+ * 100 events.
+ * <p>
+ * <tt>timeout: </tt> The length of time in milliseconds the sink waits for
+ * callbacks from HBase for all events in a transaction. If no timeout is
+ * specified, or the value is not positive, the default timeout is used.<p>
+ * <p>
+ * <strong>Note: </strong> HBase does not guarantee atomic commits across
+ * multiple rows. If HBase fails after persisting only a subset of the events
+ * in a batch, the Flume transaction is rolled back and Flume writes every
+ * event in that transaction again, which will cause duplicates. The
+ * serializer is expected to handle such duplicates. HBase also does not
+ * support batch increments, so if the serializer returns multiple increments,
+ * an HBase failure will cause them to be re-applied when HBase comes back
+ * up.
+ */
+public class AsyncHBaseSink extends AbstractSink implements Configurable {
+
+  private String tableName;
+  private byte[] columnFamily;
+  private long batchSize;
+  private static final Logger logger = LoggerFactory.getLogger(AsyncHBaseSink.class);
+  private AsyncHbaseEventSerializer serializer;
+  private String eventSerializerType;
+  private Context serializerContext;
+  private HBaseClient client;
+  private Configuration conf;
+  private Transaction txn;
+  private volatile boolean open = false;
+  private SinkCounter sinkCounter;
+  private long timeout;
+  private String zkQuorum;
+  private String zkBaseDir;
+  private ExecutorService sinkCallbackPool;
+  private boolean isTimeoutTest;
+  private boolean isCoalesceTest;
+  private boolean enableWal = true;
+  private boolean batchIncrements = false;
+  private volatile int totalCallbacksReceived = 0;
+  private int maxConsecutiveFails;
+  private Map<CellIdentifier, AtomicIncrementRequest> incrementBuffer;
+  // The HBaseClient buffers the requests until a callback is received. In the event of a
+  // timeout, there is no way to clear these buffers. If there is a major cluster issue, this
+  // buffer can become too big and cause crashes. So if we hit a fixed number of HBase write
+  // failures/timeouts, then close the HBase Client (gracefully or not) and force a GC to get rid
+  // of the buffered data.
+  private int consecutiveHBaseFailures = 0;
+  private boolean lastTxnFailed = false;
+
+  // Does not need to be thread-safe. Always called only from the sink's
+  // process method.
+  private final Comparator<byte[]> COMPARATOR = UnsignedBytes.lexicographicalComparator();
+
+  public AsyncHBaseSink() {
+    this(null);
+  }
+
+  public AsyncHBaseSink(Configuration conf) {
+    this(conf, false, false);
+  }
+
+  @VisibleForTesting
+  AsyncHBaseSink(Configuration conf, boolean isTimeoutTest,
+                 boolean isCoalesceTest) {
+    this.conf = conf;
+    this.isTimeoutTest = isTimeoutTest;
+    this.isCoalesceTest = isCoalesceTest;
+  }
+
+  @Override
+  public Status process() throws EventDeliveryException {
+    if (!open) {
+      throw new EventDeliveryException("Sink was never opened. " +
+          "Please fix the configuration.");
+    }
+    if (client == null) {
+      client = initHBaseClient();
+      if (client == null) {
+        throw new EventDeliveryException("Could not establish connection to HBase!");
+      }
+    }
+    /*
+     * txnFail records failure of the current transaction only.
+     * Since each txn gets a new boolean, failure of one txn will not affect
+     * the next even if errbacks for the current txn get called while
+     * the next one is being processed. The callback counters below are
+     * likewise created afresh for every call to process().
+     */
+    AtomicBoolean txnFail = new AtomicBoolean(false);
+    AtomicInteger callbacksReceived = new AtomicInteger(0);
+    AtomicInteger callbacksExpected = new AtomicInteger(0);
+    final Lock lock = new ReentrantLock();
+    final Condition condition = lock.newCondition();
+    if (incrementBuffer != null) {
+      incrementBuffer.clear();
+    }
+    /*
+     * Callbacks can be reused per transaction, since they share the same
+     * locks and conditions.
+     */
+    Callback<Object, Object> putSuccessCallback =
+        new SuccessCallback<Object, Object>(
+            lock, callbacksReceived, condition);
+    Callback<Object, Exception> putFailureCallback =
+        new FailureCallback<Object, Exception>(
+            lock, callbacksReceived, txnFail, condition);
+
+    Callback<Long, Long> incrementSuccessCallback =
+        new SuccessCallback<Long, Long>(
+            lock, callbacksReceived, condition);
+    Callback<Long, Exception> incrementFailureCallback =
+        new FailureCallback<Long, Exception>(
+            lock, callbacksReceived, txnFail, condition);
+
+    Status status = Status.READY;
+    Channel channel = getChannel();
+    txn = channel.getTransaction();
+    txn.begin();
+
+    int i = 0;
+    try {
+      for (; i < batchSize; i++) {
+        Event event = channel.take();
+        if (event == null) {
+          status = Status.BACKOFF;
+          if (i == 0) {
+            sinkCounter.incrementBatchEmptyCount();
+          } else {
+            sinkCounter.incrementBatchUnderflowCount();
+          }
+          break;
+        } else {
+          serializer.setEvent(event);
+          List<PutRequest> actions = serializer.getActions();
+          List<AtomicIncrementRequest> increments = serializer.getIncrements();
+          callbacksExpected.addAndGet(actions.size());
+          if (!batchIncrements) {
+            callbacksExpected.addAndGet(increments.size());
+          }
+
+          for (PutRequest action : actions) {
+            action.setDurable(enableWal);
+            client.put(action).addCallbacks(putSuccessCallback, putFailureCallback);
+          }
+          for (AtomicIncrementRequest increment : increments) {
+            if (batchIncrements) {
+              CellIdentifier identifier = new CellIdentifier(increment.key(),
+                  increment.qualifier());
+              AtomicIncrementRequest request
+                  = incrementBuffer.get(identifier);
+              if (request == null) {
+                incrementBuffer.put(identifier, increment);
+              } else {
+                request.setAmount(request.getAmount() + increment.getAmount());
+              }
+            } else {
+              client.atomicIncrement(increment).addCallbacks(
+                  incrementSuccessCallback, incrementFailureCallback);
+            }
+          }
+        }
+      }
+      if (batchIncrements) {
+        Collection<AtomicIncrementRequest> increments = incrementBuffer.values();
+        for (AtomicIncrementRequest increment : increments) {
+          client.atomicIncrement(increment).addCallbacks(
+              incrementSuccessCallback, incrementFailureCallback);
+        }
+        callbacksExpected.addAndGet(increments.size());
+      }
+      client.flush();
+    } catch (Throwable e) {
+      this.handleTransactionFailure(txn);
+      this.checkIfChannelExceptionAndThrow(e);
+    }
+    if (i == batchSize) {
+      sinkCounter.incrementBatchCompleteCount();
+    }
+    sinkCounter.addToEventDrainAttemptCount(i);
+
+    lock.lock();
+    long startTime = System.nanoTime();
+    long timeRemaining;
+    try {
+      while ((callbacksReceived.get() < callbacksExpected.get())
+          && !txnFail.get()) {
+        timeRemaining = timeout - (System.nanoTime() - startTime);
+        timeRemaining = (timeRemaining >= 0) ? timeRemaining : 0;
+        try {
+          if (!condition.await(timeRemaining, TimeUnit.NANOSECONDS)) {
+            txnFail.set(true);
+            logger.warn("HBase callbacks timed out. "
+                + "Transaction will be rolled back.");
+          }
+        } catch (Exception ex) {
+          logger.error("Exception while waiting for callbacks from HBase.", ex);
+          this.handleTransactionFailure(txn);
+          Throwables.propagate(ex);
+        }
+      }
+    } finally {
+      lock.unlock();
+    }
+
+    if (isCoalesceTest) {
+      totalCallbacksReceived += callbacksReceived.get();
+    }
+
+    /*
+     * At this point, either the txn has failed (an errback fired or the
+     * wait timed out) or all callbacks were received and the txn succeeded.
+     *
+     * This need not be in the monitor, since the wait for callbacks is
+     * over. So txnFail will not be modified any more (even if
+     * it is, it is set from true to true only - false happens only
+     * in the next process call).
+     *
+     */
+    if (txnFail.get()) {
+      // We enter this if condition only if the failure was due to HBase failure, so we make sure
+      // we track the consecutive failures.
+      if (lastTxnFailed) {
+        consecutiveHBaseFailures++;
+      }
+      lastTxnFailed = true;
+      this.handleTransactionFailure(txn);
+      throw new EventDeliveryException("Could not write events to Hbase. " +
+          "Transaction failed, and rolled back.");
+    } else {
+      try {
+        lastTxnFailed = false;
+        consecutiveHBaseFailures = 0;
+        txn.commit();
+        txn.close();
+        sinkCounter.addToEventDrainSuccessCount(i);
+      } catch (Throwable e) {
+        this.handleTransactionFailure(txn);
+        this.checkIfChannelExceptionAndThrow(e);
+      }
+    }
+
+    return status;
+  }
+
+  @Override
+  public void configure(Context context) {
+    tableName = context.getString(HBaseSinkConfigurationConstants.CONFIG_TABLE);
+    String cf = context.getString(
+        HBaseSinkConfigurationConstants.CONFIG_COLUMN_FAMILY);
+    batchSize = context.getLong(
+        HBaseSinkConfigurationConstants.CONFIG_BATCHSIZE, 100L);
+    serializerContext = new Context();
+    //If not specified, the default serializer is used (see below).
+    eventSerializerType = context.getString(
+        HBaseSinkConfigurationConstants.CONFIG_SERIALIZER);
+    Preconditions.checkNotNull(tableName,
+        "Table name cannot be empty, please specify in configuration file");
+    Preconditions.checkNotNull(cf,
+        "Column family cannot be empty, please specify in configuration file");
+    //Check for event serializer, if null or empty use the default serializer type
+    if (eventSerializerType == null || eventSerializerType.isEmpty()) {
+      eventSerializerType =
+          "org.apache.flume.sink.hbase.SimpleAsyncHbaseEventSerializer";
+      logger.info("No serializer defined, will use default");
+    }
+    serializerContext.putAll(context.getSubProperties(
+        HBaseSinkConfigurationConstants.CONFIG_SERIALIZER_PREFIX));
+    columnFamily = cf.getBytes(Charsets.UTF_8);
+    try {
+      @SuppressWarnings("unchecked")
+      Class<? extends AsyncHbaseEventSerializer> clazz =
+          (Class<? extends AsyncHbaseEventSerializer>)
+              Class.forName(eventSerializerType);
+      serializer = clazz.newInstance();
+      serializer.configure(serializerContext);
+      serializer.initialize(tableName.getBytes(Charsets.UTF_8), columnFamily);
+    } catch (Exception e) {
+      logger.error("Could not instantiate event serializer.", e);
+      Throwables.propagate(e);
+    }
+
+    if (sinkCounter == null) {
+      sinkCounter = new SinkCounter(this.getName());
+    }
+    timeout = context.getLong(HBaseSinkConfigurationConstants.CONFIG_TIMEOUT,
+        HBaseSinkConfigurationConstants.DEFAULT_TIMEOUT);
+    if (timeout <= 0) {
+      logger.warn("Timeout should be positive for Hbase sink. "
+          + "Using the default timeout instead.");
+      timeout = HBaseSinkConfigurationConstants.DEFAULT_TIMEOUT;
+    }
+    //Convert to nanos.
+    timeout = TimeUnit.MILLISECONDS.toNanos(timeout);
+
+    zkQuorum = context.getString(
+        HBaseSinkConfigurationConstants.ZK_QUORUM, "").trim();
+    if (!zkQuorum.isEmpty()) {
+      zkBaseDir = context.getString(
+          HBaseSinkConfigurationConstants.ZK_ZNODE_PARENT,
+          HBaseSinkConfigurationConstants.DEFAULT_ZK_ZNODE_PARENT);
+    } else {
+      if (conf == null) { //In tests, we pass the conf in.
+        conf = HBaseConfiguration.create();
+      }
+      zkQuorum = ZKConfig.getZKQuorumServersString(conf);
+      zkBaseDir = conf.get(HConstants.ZOOKEEPER_ZNODE_PARENT,
+          HConstants.DEFAULT_ZOOKEEPER_ZNODE_PARENT);
+    }
+    Preconditions.checkState(zkQuorum != null && !zkQuorum.isEmpty(),
+        "The Zookeeper quorum cannot be null and should be specified.");
+
+    enableWal = context.getBoolean(HBaseSinkConfigurationConstants
+        .CONFIG_ENABLE_WAL, HBaseSinkConfigurationConstants.DEFAULT_ENABLE_WAL);
+    logger.info("The write to WAL option is set to: {}", enableWal);
+    if (!enableWal) {
+      logger.warn("AsyncHBaseSink's enableWal configuration is set to false. " +
+          "All writes to HBase will have WAL disabled, and any data in the " +
+          "memstore of this region in the Region Server could be lost!");
+    }
+
+    batchIncrements = context.getBoolean(
+        HBaseSinkConfigurationConstants.CONFIG_COALESCE_INCREMENTS,
+        HBaseSinkConfigurationConstants.DEFAULT_COALESCE_INCREMENTS);
+
+    if (batchIncrements) {
+      incrementBuffer = Maps.newHashMap();
+      logger.info("Increment coalescing is enabled. Increments will be " +
+          "buffered.");
+    }
+
+    maxConsecutiveFails =
+        context.getInteger(HBaseSinkConfigurationConstants.CONFIG_MAX_CONSECUTIVE_FAILS,
+                           HBaseSinkConfigurationConstants.DEFAULT_MAX_CONSECUTIVE_FAILS);
+
+  }
+
+  @VisibleForTesting
+  int getTotalCallbacksReceived() {
+    return totalCallbacksReceived;
+  }
+
+  @VisibleForTesting
+  boolean isConfNull() {
+    return conf == null;
+  }
+
+  @Override
+  public void start() {
+    Preconditions.checkArgument(client == null, "Please call stop "
+        + "before calling start on an old instance.");
+    sinkCounter.start();
+    sinkCounter.incrementConnectionCreatedCount();
+    client = initHBaseClient();
+    super.start();
+  }
+
+  private HBaseClient initHBaseClient() {
+    logger.info("Initializing HBase Client");
+
+    sinkCallbackPool = Executors.newCachedThreadPool(new ThreadFactoryBuilder()
+        .setNameFormat(this.getName() + " HBase Call Pool").build());
+    logger.info("Callback pool created");
+    client = new HBaseClient(zkQuorum, zkBaseDir,
+        new NioClientSocketChannelFactory(sinkCallbackPool, sinkCallbackPool));
+
+    final CountDownLatch latch = new CountDownLatch(1);
+    final AtomicBoolean fail = new AtomicBoolean(false);
+    client.ensureTableFamilyExists(
+        tableName.getBytes(Charsets.UTF_8), columnFamily).addCallbacks(
+          new Callback<Object, Object>() {
+            @Override
+            public Object call(Object arg) throws Exception {
+              latch.countDown();
+              logger.info("table found");
+              return null;
+            }
+          },
+          new Callback<Object, Object>() {
+            @Override
+            public Object call(Object arg) throws Exception {
+              fail.set(true);
+              latch.countDown();
+              return null;
+            }
+          });
+
+    try {
+      logger.info("waiting on callback");
+      latch.await();
+      logger.info("callback received");
+    } catch (InterruptedException e) {
+      sinkCounter.incrementConnectionFailedCount();
+      throw new FlumeException(
+          "Interrupted while waiting for Hbase Callbacks", e);
+    }
+    if (fail.get()) {
+      sinkCounter.incrementConnectionFailedCount();
+      if (client != null) {
+        shutdownHBaseClient();
+      }
+      throw new FlumeException(
+          "Could not start sink. " +
+              "Table or column family does not exist in Hbase.");
+    } else {
+      open = true;
+    }
+    client.setFlushInterval((short) 0);
+    return client;
+  }
+
+  @Override
+  public void stop() {
+    serializer.cleanUp();
+    if (client != null) {
+      shutdownHBaseClient();
+    }
+    sinkCounter.incrementConnectionClosedCount();
+    sinkCounter.stop();
+
+    try {
+      if (sinkCallbackPool != null) {
+        sinkCallbackPool.shutdown();
+        if (!sinkCallbackPool.awaitTermination(5, TimeUnit.SECONDS)) {
+          sinkCallbackPool.shutdownNow();
+        }
+      }
+    } catch (InterruptedException e) {
+      logger.error("Interrupted while waiting for asynchbase sink pool to " +
+          "die", e);
+      if (sinkCallbackPool != null) {
+        sinkCallbackPool.shutdownNow();
+      }
+    }
+    sinkCallbackPool = null;
+    client = null;
+    conf = null;
+    open = false;
+    super.stop();
+  }
+
+  private void shutdownHBaseClient() {
+    logger.info("Shutting down HBase Client");
+    final CountDownLatch waiter = new CountDownLatch(1);
+    try {
+      client.shutdown().addCallback(new Callback<Object, Object>() {
+        @Override
+        public Object call(Object arg) throws Exception {
+          waiter.countDown();
+          return null;
+        }
+      }).addErrback(new Callback<Object, Object>() {
+        @Override
+        public Object call(Object arg) throws Exception {
+          logger.error("Failed to shutdown HBase client cleanly! HBase cluster might be down");
+          waiter.countDown();
+          return null;
+        }
+      });
+      if (!waiter.await(timeout, TimeUnit.NANOSECONDS)) {
+        logger.error("HBase connection could not be closed within timeout! HBase cluster might " +
+            "be down!");
+      }
+    } catch (Exception ex) {
+      logger.warn("Error while attempting to close connections to HBase", ex);
+    } finally {
+      // Dereference the client to force GC to clear up any buffered requests.
+      client = null;
+    }
+  }
+
+  private void handleTransactionFailure(Transaction txn)
+      throws EventDeliveryException {
+    if (maxConsecutiveFails > 0 && consecutiveHBaseFailures >= maxConsecutiveFails) {
+      if (client != null) {
+        shutdownHBaseClient();
+      }
+      consecutiveHBaseFailures = 0;
+    }
+    try {
+      txn.rollback();
+    } catch (Throwable e) {
+      // Log once here; the branches below only decide how to rethrow.
+      logger.error("Failed to roll back transaction.", e);
+      if (e instanceof Error || e instanceof RuntimeException) {
+        // Unchecked throwables are rethrown unchanged so that callers
+        // see the original failure.
+        Throwables.propagate(e);
+      } else {
+        // Checked exceptions are wrapped in the exception type that
+        // process() is declared to throw.
+        throw new EventDeliveryException(
+            "Failed to roll back transaction.", e);
+      }
+    } finally {
+      txn.close();
+    }
+  }
+
+  private class SuccessCallback<R, T> implements Callback<R, T> {
+    private Lock lock;
+    private AtomicInteger callbacksReceived;
+    private Condition condition;
+    private final boolean isTimeoutTesting;
+
+    public SuccessCallback(Lock lck, AtomicInteger callbacksReceived,
+                           Condition condition) {
+      lock = lck;
+      this.callbacksReceived = callbacksReceived;
+      this.condition = condition;
+      isTimeoutTesting = isTimeoutTest;
+    }
+
+    @Override
+    public R call(T arg) throws Exception {
+      if (isTimeoutTesting) {
+        try {
+          //tests set timeout to 10 seconds, so sleep for 4 seconds
+          TimeUnit.NANOSECONDS.sleep(TimeUnit.SECONDS.toNanos(4));
+        } catch (InterruptedException e) {
+          //ignore
+        }
+      }
+      doCall();
+      return null;
+    }
+
+    private void doCall() throws Exception {
+      callbacksReceived.incrementAndGet();
+      lock.lock();
+      try {
+        condition.signal();
+      } finally {
+        lock.unlock();
+      }
+    }
+  }
+
+  private class FailureCallback<R, T extends Exception> implements Callback<R, T> {
+    private Lock lock;
+    private AtomicInteger callbacksReceived;
+    private AtomicBoolean txnFail;
+    private Condition condition;
+    private final boolean isTimeoutTesting;
+
+    public FailureCallback(Lock lck, AtomicInteger callbacksReceived,
+                           AtomicBoolean txnFail, Condition condition) {
+      this.lock = lck;
+      this.callbacksReceived = callbacksReceived;
+      this.txnFail = txnFail;
+      this.condition = condition;
+      isTimeoutTesting = isTimeoutTest;
+    }
+
+    @Override
+    public R call(T arg) throws Exception {
+      logger.error("failure callback:", arg);
+      if (isTimeoutTesting) {
+        //tests set timeout to 10 seconds, so sleep for 4 seconds
+        try {
+          TimeUnit.NANOSECONDS.sleep(TimeUnit.SECONDS.toNanos(4));
+        } catch (InterruptedException e) {
+          //ignore
+        }
+      }
+      doCall();
+      return null;
+    }
+
+    private void doCall() throws Exception {
+      callbacksReceived.incrementAndGet();
+      this.txnFail.set(true);
+      lock.lock();
+      try {
+        condition.signal();
+      } finally {
+        lock.unlock();
+      }
+    }
+  }
+
+  private void checkIfChannelExceptionAndThrow(Throwable e)
+      throws EventDeliveryException {
+    if (e instanceof ChannelException) {
+      throw new EventDeliveryException("Error in processing transaction.", e);
+    } else if (e instanceof Error || e instanceof RuntimeException) {
+      Throwables.propagate(e);
+    }
+    throw new EventDeliveryException("Error in processing transaction.", e);
+  }
+
+  private class CellIdentifier {
+    private final byte[] row;
+    private final byte[] column;
+    private final int hashCode;
+
+    // Since the sink operates only on one table and one cf,
+    // we use the data from the owning sink
+    public CellIdentifier(byte[] row, byte[] column) {
+      this.row = row;
+      this.column = column;
+      this.hashCode =
+          (Arrays.hashCode(row) * 31) * (Arrays.hashCode(column) * 31);
+    }
+
+    @Override
+    public int hashCode() {
+      return hashCode;
+    }
+
+    // Since we know that this class is used only from the enclosing sink,
+    // skip the class comparison to save time
+    @Override
+    public boolean equals(Object other) {
+      CellIdentifier o = (CellIdentifier) other;
+      if (other == null) {
+        return false;
+      } else {
+        return (COMPARATOR.compare(row, o.row) == 0
+            && COMPARATOR.compare(column, o.column) == 0);
+      }
+    }
+  }
+}
diff --git a/code/flume-ng-sinks/flume-ng-hbase-sink/src/main/java/org/apache/flume/sink/hbase/AsyncHbaseEventSerializer.java b/code/flume-ng-sinks/flume-ng-hbase-sink/src/main/java/org/apache/flume/sink/hbase/AsyncHbaseEventSerializer.java
new file mode 100644
index 0000000..481fce8
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-hbase-sink/src/main/java/org/apache/flume/sink/hbase/AsyncHbaseEventSerializer.java
@@ -0,0 +1,77 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.flume.sink.hbase;
+
+import java.util.List;
+
+import org.apache.flume.Event;
+import org.apache.flume.conf.Configurable;
+import org.apache.flume.conf.ConfigurableComponent;
+import org.hbase.async.AtomicIncrementRequest;
+import org.hbase.async.PutRequest;
+
+/**
+ * Interface for an event serializer which serializes the headers and body
+ * of an event to write them to hbase. This is configurable, so any config
+ * params required should be taken through this.
+ * The column family must exist in the given table. An implementation
+ * of this interface is expected by the {@linkplain AsyncHBaseSink} to serialize
+ * the events.
+ */
+public interface AsyncHbaseEventSerializer extends Configurable, ConfigurableComponent {
+
+  /**
+   * Initialize the event serializer.
+   * @param table - The table the serializer should use when creating
+   * {@link org.hbase.async.PutRequest} or
+   * {@link org.hbase.async.AtomicIncrementRequest}.
+   * @param cf - The column family to be used.
+   */
+  public void initialize(byte[] table, byte[] cf);
+
+  /**
+   * @param event Event to be written to HBase
+   */
+  public void setEvent(Event event);
+
+  /**
+   * Get the actions that should be written out to hbase as a result of this
+   * event. This list is written to hbase.
+   * @return List of {@link org.hbase.async.PutRequest} which
+   * are written as such to HBase.
+   *
+   *
+   */
+  public List<PutRequest> getActions();
+
+  /**
+   * Get the increments that should be made in hbase as a result of this
+   * event. This list is written to hbase.
+   * @return List of {@link org.hbase.async.AtomicIncrementRequest} which
+   * are written as such to HBase.
+   *
+   *
+   */
+  public List<AtomicIncrementRequest> getIncrements();
+
+  /**
+   * Clean up any state. This will be called when the sink is being stopped.
+   */
+  public void cleanUp();
+}
diff --git a/code/flume-ng-sinks/flume-ng-hbase-sink/src/main/java/org/apache/flume/sink/hbase/BatchAware.java b/code/flume-ng-sinks/flume-ng-hbase-sink/src/main/java/org/apache/flume/sink/hbase/BatchAware.java
new file mode 100644
index 0000000..0974241
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-hbase-sink/src/main/java/org/apache/flume/sink/hbase/BatchAware.java
@@ -0,0 +1,28 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.flume.sink.hbase;
+
+/**
+ * This interface allows for implementing HBase serializers that are aware of
+ * batching. {@link #onBatchStart()} is called at the beginning of each batch
+ * by the sink.
+ */
+public interface BatchAware {
+  public void onBatchStart();
+}
diff --git a/code/flume-ng-sinks/flume-ng-hbase-sink/src/main/java/org/apache/flume/sink/hbase/HBaseSink.java b/code/flume-ng-sinks/flume-ng-hbase-sink/src/main/java/org/apache/flume/sink/hbase/HBaseSink.java
new file mode 100644
index 0000000..4c8b52b
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-hbase-sink/src/main/java/org/apache/flume/sink/hbase/HBaseSink.java
@@ -0,0 +1,558 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.flume.sink.hbase;
+
+import com.google.common.annotations.VisibleForTesting;
+import com.google.common.base.Charsets;
+import com.google.common.base.Preconditions;
+import com.google.common.base.Throwables;
+import com.google.common.collect.Lists;
+import com.google.common.collect.Maps;
+import org.apache.flume.Channel;
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.flume.EventDeliveryException;
+import org.apache.flume.FlumeException;
+import org.apache.flume.Transaction;
+import org.apache.flume.annotations.InterfaceAudience;
+import org.apache.flume.auth.FlumeAuthenticationUtil;
+import org.apache.flume.auth.PrivilegedExecutor;
+import org.apache.flume.conf.Configurable;
+import org.apache.flume.instrumentation.SinkCounter;
+import org.apache.flume.sink.AbstractSink;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.hbase.HBaseConfiguration;
+import org.apache.hadoop.hbase.HConstants;
+import org.apache.hadoop.hbase.client.HTable;
+import org.apache.hadoop.hbase.client.Increment;
+import org.apache.hadoop.hbase.client.Put;
+import org.apache.hadoop.hbase.client.Row;
+import org.apache.hadoop.hbase.util.Bytes;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.lang.reflect.InvocationTargetException;
+import java.lang.reflect.Method;
+import java.security.PrivilegedExceptionAction;
+import java.util.LinkedList;
+import java.util.List;
+import java.util.Map;
+import java.util.NavigableMap;
+
+/**
+ * A simple sink which reads events from a channel and writes them to HBase.
+ * The HBase configuration is picked up from the first <tt>hbase-site.xml</tt>
+ * encountered in the classpath. This sink supports batch reading of
+ * events from the channel, and writing them to HBase, to minimize the number
+ * of flushes on the HBase tables. To use this sink, it has to be configured
+ * with certain mandatory parameters:<p>
+ * <tt>table: </tt> The name of the table in HBase to write to. <p>
+ * <tt>columnFamily: </tt> The column family in HBase to write to.<p>
+ * This sink will commit each transaction if the table's write buffer size is
+ * reached or if the number of events in the current transaction reaches the
+ * batch size, whichever comes first.<p>
+ * Other optional parameters are:<p>
+ * <tt>serializer:</tt> A class implementing {@link HbaseEventSerializer}.
+ * An instance of
+ * this class will be used to write out events to hbase.<p>
+ * <tt>serializer.*:</tt> Passed in the configure() method to serializer
+ * as an object of {@link org.apache.flume.Context}.<p>
+ * <tt>batchSize: </tt>This is the batch size used by the client. This is the
+ * maximum number of events the sink will commit per transaction. The default
+ * batch size is 100 events.
+ * <p>
+ * <p>
+ * <strong>Note: </strong> While this sink flushes all events in a transaction
+ * to HBase in one shot, HBase does not guarantee atomic commits on multiple
+ * rows. So if a subset of events in a batch is written to disk by HBase and
+ * HBase then fails, the Flume transaction is rolled back, causing Flume to
+ * write all the events in the transaction all over again, which will cause
+ * duplicates. The serializer is expected to take care of the handling of
+ * duplicates etc. HBase also does not support batch increments, so if
+ * multiple increments are returned by the serializer, then an HBase failure
+ * will cause them to be re-written when HBase comes back up.
+ */
+public class HBaseSink extends AbstractSink implements Configurable {
+  private String tableName;
+  private byte[] columnFamily;
+  private HTable table;
+  private long batchSize;
+  private Configuration config;
+  private static final Logger logger = LoggerFactory.getLogger(HBaseSink.class);
+  private HbaseEventSerializer serializer;
+  private String eventSerializerType;
+  private Context serializerContext;
+  private String kerberosPrincipal;
+  private String kerberosKeytab;
+  private boolean enableWal = true;
+  private boolean batchIncrements = false;
+  private Method refGetFamilyMap = null;
+  private SinkCounter sinkCounter;
+  private PrivilegedExecutor privilegedExecutor;
+
+  // Internal hooks used for unit testing.
+  private DebugIncrementsCallback debugIncrCallback = null;
+
+  public HBaseSink() {
+    this(HBaseConfiguration.create());
+  }
+
+  public HBaseSink(Configuration conf) {
+    this.config = conf;
+  }
+
+  @VisibleForTesting
+  @InterfaceAudience.Private
+  HBaseSink(Configuration conf, DebugIncrementsCallback cb) {
+    this(conf);
+    this.debugIncrCallback = cb;
+  }
+
+  @Override
+  public void start() {
+    Preconditions.checkArgument(table == null, "Please call stop " +
+        "before calling start on an old instance.");
+    try {
+      privilegedExecutor =
+          FlumeAuthenticationUtil.getAuthenticator(kerberosPrincipal, kerberosKeytab);
+    } catch (Exception ex) {
+      sinkCounter.incrementConnectionFailedCount();
+      throw new FlumeException("Failed to login to HBase using "
+          + "provided credentials.", ex);
+    }
+    try {
+      table = privilegedExecutor.execute(new PrivilegedExceptionAction<HTable>() {
+        @Override
+        public HTable run() throws Exception {
+          HTable table = new HTable(config, tableName);
+          table.setAutoFlush(false);
+          // Flush is controlled by us. This ensures that HBase changing
+          // their criteria for flushing does not change how we flush.
+          return table;
+        }
+      });
+    } catch (Exception e) {
+      sinkCounter.incrementConnectionFailedCount();
+      logger.error("Could not load table, " + tableName +
+          " from HBase", e);
+      throw new FlumeException("Could not load table, " + tableName +
+          " from HBase", e);
+    }
+    try {
+      if (!privilegedExecutor.execute(new PrivilegedExceptionAction<Boolean>() {
+        @Override
+        public Boolean run() throws IOException {
+          return table.getTableDescriptor().hasFamily(columnFamily);
+        }
+      })) {
+        throw new IOException("Table " + tableName
+            + " has no such column family " + Bytes.toString(columnFamily));
+      }
+    } catch (Exception e) {
+      // getTableDescriptor() also throws IOException, so this catches the
+      // IOException thrown above or by the getTableDescriptor() call.
+      sinkCounter.incrementConnectionFailedCount();
+      throw new FlumeException("Error getting column family from HBase. "
+          + "Please verify that the table " + tableName + " and Column Family, "
+          + Bytes.toString(columnFamily) + " exist in HBase, and the"
+          + " current user has permissions to access that table.", e);
+    }
+
+    super.start();
+    sinkCounter.incrementConnectionCreatedCount();
+    sinkCounter.start();
+  }
+
+  @Override
+  public void stop() {
+    try {
+      if (table != null) {
+        table.close();
+      }
+      table = null;
+    } catch (IOException e) {
+      throw new FlumeException("Error closing table.", e);
+    }
+    sinkCounter.incrementConnectionClosedCount();
+    sinkCounter.stop();
+  }
+
+  @SuppressWarnings("unchecked")
+  @Override
+  public void configure(Context context) {
+    tableName = context.getString(HBaseSinkConfigurationConstants.CONFIG_TABLE);
+    String cf = context.getString(
+        HBaseSinkConfigurationConstants.CONFIG_COLUMN_FAMILY);
+    batchSize = context.getLong(
+        HBaseSinkConfigurationConstants.CONFIG_BATCHSIZE, 100L);
+    serializerContext = new Context();
+    //If not specified, will use HBase defaults.
+    eventSerializerType = context.getString(
+        HBaseSinkConfigurationConstants.CONFIG_SERIALIZER);
+    Preconditions.checkNotNull(tableName,
+        "Table name cannot be empty, please specify in configuration file");
+    Preconditions.checkNotNull(cf,
+        "Column family cannot be empty, please specify in configuration file");
+    // Check for an event serializer; if none is configured, use the default.
+    if (eventSerializerType == null || eventSerializerType.isEmpty()) {
+      eventSerializerType =
+          "org.apache.flume.sink.hbase.SimpleHbaseEventSerializer";
+      logger.info("No serializer defined; will use the default.");
+    }
+    serializerContext.putAll(context.getSubProperties(
+        HBaseSinkConfigurationConstants.CONFIG_SERIALIZER_PREFIX));
+    columnFamily = cf.getBytes(Charsets.UTF_8);
+    try {
+      Class<? extends HbaseEventSerializer> clazz =
+          (Class<? extends HbaseEventSerializer>)
+              Class.forName(eventSerializerType);
+      serializer = clazz.newInstance();
+      serializer.configure(serializerContext);
+    } catch (Exception e) {
+      logger.error("Could not instantiate event serializer.", e);
+      Throwables.propagate(e);
+    }
+    kerberosKeytab = context.getString(HBaseSinkConfigurationConstants.CONFIG_KEYTAB);
+    kerberosPrincipal = context.getString(HBaseSinkConfigurationConstants.CONFIG_PRINCIPAL);
+
+    enableWal = context.getBoolean(HBaseSinkConfigurationConstants
+        .CONFIG_ENABLE_WAL, HBaseSinkConfigurationConstants.DEFAULT_ENABLE_WAL);
+    logger.info("The write to WAL option is set to: " + String.valueOf(enableWal));
+    if (!enableWal) {
+      logger.warn("HBase Sink's enableWal configuration is set to false. All " +
+          "writes to HBase will have WAL disabled, and any data in the " +
+          "memstore of this region in the Region Server could be lost!");
+    }
+
+    batchIncrements = context.getBoolean(
+        HBaseSinkConfigurationConstants.CONFIG_COALESCE_INCREMENTS,
+        HBaseSinkConfigurationConstants.DEFAULT_COALESCE_INCREMENTS);
+
+    if (batchIncrements) {
+      logger.info("Increment coalescing is enabled. Increments will be " +
+          "buffered.");
+      refGetFamilyMap = reflectLookupGetFamilyMap();
+    }
+
+    String zkQuorum = context.getString(HBaseSinkConfigurationConstants
+        .ZK_QUORUM);
+    Integer port = null;
+    /*
+     * HBase allows multiple nodes in the quorum, but all need to use the
+     * same client port. So get the nodes in host:port format and verify
+     * that every node specifies the same port as the first one. A node
+     * without a port is rejected.
+     */
+    if (zkQuorum != null && !zkQuorum.isEmpty()) {
+      StringBuilder zkBuilder = new StringBuilder();
+      logger.info("Using ZK Quorum: " + zkQuorum);
+      String[] zkHosts = zkQuorum.split(",");
+      int length = zkHosts.length;
+      for (int i = 0; i < length; i++) {
+        String[] zkHostAndPort = zkHosts[i].split(":");
+        zkBuilder.append(zkHostAndPort[0].trim());
+        if (i != length - 1) {
+          zkBuilder.append(",");
+        } else {
+          zkQuorum = zkBuilder.toString();
+        }
+        if (zkHostAndPort.length < 2) {
+          throw new FlumeException("Expected client port for the ZK node!");
+        }
+        if (port == null) {
+          port = Integer.parseInt(zkHostAndPort[1].trim());
+        } else if (!port.equals(Integer.parseInt(zkHostAndPort[1].trim()))) {
+          throw new FlumeException("All Zookeeper nodes in the quorum must " +
+              "use the same client port.");
+        }
+      }
+      if (port == null) {
+        port = HConstants.DEFAULT_ZOOKEPER_CLIENT_PORT;
+      }
+      this.config.set(HConstants.ZOOKEEPER_QUORUM, zkQuorum);
+      this.config.setInt(HConstants.ZOOKEEPER_CLIENT_PORT, port);
+    }
+    String hbaseZnode = context.getString(
+        HBaseSinkConfigurationConstants.ZK_ZNODE_PARENT);
+    if (hbaseZnode != null && !hbaseZnode.isEmpty()) {
+      this.config.set(HConstants.ZOOKEEPER_ZNODE_PARENT, hbaseZnode);
+    }
+    sinkCounter = new SinkCounter(this.getName());
+  }
+
+  public Configuration getConfig() {
+    return config;
+  }
+
+  @Override
+  public Status process() throws EventDeliveryException {
+    Status status = Status.READY;
+    Channel channel = getChannel();
+    Transaction txn = channel.getTransaction();
+    List<Row> actions = new LinkedList<Row>();
+    List<Increment> incs = new LinkedList<Increment>();
+    try {
+      txn.begin();
+
+      if (serializer instanceof BatchAware) {
+        ((BatchAware) serializer).onBatchStart();
+      }
+
+      long i = 0;
+      for (; i < batchSize; i++) {
+        Event event = channel.take();
+        if (event == null) {
+          if (i == 0) {
+            status = Status.BACKOFF;
+            sinkCounter.incrementBatchEmptyCount();
+          } else {
+            sinkCounter.incrementBatchUnderflowCount();
+          }
+          break;
+        } else {
+          serializer.initialize(event, columnFamily);
+          actions.addAll(serializer.getActions());
+          incs.addAll(serializer.getIncrements());
+        }
+      }
+      if (i == batchSize) {
+        sinkCounter.incrementBatchCompleteCount();
+      }
+      sinkCounter.addToEventDrainAttemptCount(i);
+
+      putEventsAndCommit(actions, incs, txn);
+
+    } catch (Throwable e) {
+      try {
+        txn.rollback();
+      } catch (Exception e2) {
+        logger.error("Exception in rollback. Rollback might not have been " +
+            "successful.", e2);
+      }
+      logger.error("Failed to commit transaction. " +
+          "Transaction rolled back.", e);
+      if (e instanceof Error || e instanceof RuntimeException) {
+        // Already logged above; rethrow unchecked throwables unchanged.
+        // Throwables.propagate() never returns normally.
+        Throwables.propagate(e);
+      } else {
+        // Already logged above; wrap checked exceptions so the caller
+        // sees a delivery failure.
+        throw new EventDeliveryException("Failed to commit transaction. " +
+            "Transaction rolled back.", e);
+      }
+    } finally {
+      txn.close();
+    }
+    return status;
+  }
+
+  private void putEventsAndCommit(final List<Row> actions,
+                                  final List<Increment> incs, Transaction txn) throws Exception {
+
+    privilegedExecutor.execute(new PrivilegedExceptionAction<Void>() {
+      @Override
+      public Void run() throws Exception {
+        for (Row r : actions) {
+          if (r instanceof Put) {
+            ((Put) r).setWriteToWAL(enableWal);
+          }
+          // Newer versions of HBase - Increment implements Row.
+          if (r instanceof Increment) {
+            ((Increment) r).setWriteToWAL(enableWal);
+          }
+        }
+        table.batch(actions);
+        return null;
+      }
+    });
+
+    privilegedExecutor.execute(new PrivilegedExceptionAction<Void>() {
+      @Override
+      public Void run() throws Exception {
+
+        List<Increment> processedIncrements;
+        if (batchIncrements) {
+          processedIncrements = coalesceIncrements(incs);
+        } else {
+          processedIncrements = incs;
+        }
+
+        // Only used for unit testing.
+        if (debugIncrCallback != null) {
+          debugIncrCallback.onAfterCoalesce(processedIncrements);
+        }
+
+        for (final Increment i : processedIncrements) {
+          i.setWriteToWAL(enableWal);
+          table.increment(i);
+        }
+        return null;
+      }
+    });
+
+    txn.commit();
+    sinkCounter.addToEventDrainSuccessCount(actions.size());
+  }
+
+  /**
+   * The method getFamilyMap() is no longer available in Hbase 0.96.
+   * We must use reflection to determine which version we may use.
+   */
+  @VisibleForTesting
+  static Method reflectLookupGetFamilyMap() {
+    Method m = null;
+    String[] methodNames = {"getFamilyMapOfLongs", "getFamilyMap"};
+    for (String methodName : methodNames) {
+      try {
+        m = Increment.class.getMethod(methodName);
+        if (m != null && m.getReturnType().equals(Map.class)) {
+          logger.debug("Using Increment.{} for coalesce", methodName);
+          break;
+        }
+      } catch (NoSuchMethodException e) {
+        logger.debug("Increment.{} does not exist. Exception follows.",
+            methodName, e);
+      } catch (SecurityException e) {
+        logger.debug("No access to Increment.{}; Exception follows.",
+            methodName, e);
+      }
+    }
+    if (m == null) {
+      throw new UnsupportedOperationException(
+          "Cannot find Increment.getFamilyMap()");
+    }
+    return m;
+  }
+
+  @SuppressWarnings("unchecked")
+  private Map<byte[], NavigableMap<byte[], Long>> getFamilyMap(Increment inc) {
+    Preconditions.checkNotNull(refGetFamilyMap,
+        "Increment.getFamilyMap() not found");
+    Preconditions.checkNotNull(inc, "Increment required");
+    Map<byte[], NavigableMap<byte[], Long>> familyMap = null;
+    try {
+      Object familyObj = refGetFamilyMap.invoke(inc);
+      familyMap = (Map<byte[], NavigableMap<byte[], Long>>) familyObj;
+    } catch (IllegalAccessException e) {
+      logger.warn("Unexpected error calling getFamilyMap()", e);
+      Throwables.propagate(e);
+    } catch (InvocationTargetException e) {
+      logger.warn("Unexpected error calling getFamilyMap()", e);
+      Throwables.propagate(e);
+    }
+    return familyMap;
+  }
+
+  /**
+   * Perform "compression" on the given set of increments so that Flume sends
+   * the minimum possible number of RPC operations to HBase per batch.
+   *
+   * @param incs Input: Increment objects to coalesce.
+   * @return List of new Increment objects after coalescing the unique counts.
+   */
+  private List<Increment> coalesceIncrements(Iterable<Increment> incs) {
+    Preconditions.checkNotNull(incs, "List of Increments must not be null");
+    // Aggregate all of the increment row/family/column counts.
+    // The nested map is keyed like this: {row, family, qualifier} => count.
+    Map<byte[], Map<byte[], NavigableMap<byte[], Long>>> counters =
+        Maps.newTreeMap(Bytes.BYTES_COMPARATOR);
+    for (Increment inc : incs) {
+      byte[] row = inc.getRow();
+      Map<byte[], NavigableMap<byte[], Long>> families = getFamilyMap(inc);
+      for (Map.Entry<byte[], NavigableMap<byte[], Long>> familyEntry : families.entrySet()) {
+        byte[] family = familyEntry.getKey();
+        NavigableMap<byte[], Long> qualifiers = familyEntry.getValue();
+        for (Map.Entry<byte[], Long> qualifierEntry : qualifiers.entrySet()) {
+          byte[] qualifier = qualifierEntry.getKey();
+          Long count = qualifierEntry.getValue();
+          incrementCounter(counters, row, family, qualifier, count);
+        }
+      }
+    }
+
+    // Reconstruct list of Increments per unique row/family/qualifier.
+    List<Increment> coalesced = Lists.newLinkedList();
+    for (Map.Entry<byte[], Map<byte[], NavigableMap<byte[], Long>>> rowEntry :
+         counters.entrySet()) {
+      byte[] row = rowEntry.getKey();
+      Map<byte[], NavigableMap<byte[], Long>> families = rowEntry.getValue();
+      Increment inc = new Increment(row);
+      for (Map.Entry<byte[], NavigableMap<byte[], Long>> familyEntry : families.entrySet()) {
+        byte[] family = familyEntry.getKey();
+        NavigableMap<byte[], Long> qualifiers = familyEntry.getValue();
+        for (Map.Entry<byte[], Long> qualifierEntry : qualifiers.entrySet()) {
+          byte[] qualifier = qualifierEntry.getKey();
+          long count = qualifierEntry.getValue();
+          inc.addColumn(family, qualifier, count);
+        }
+      }
+      coalesced.add(inc);
+    }
+
+    return coalesced;
+  }
+
+  /**
+   * Helper function for {@link #coalesceIncrements} to increment a counter
+   * value in the passed data structure.
+   *
+   * @param counters  Nested data structure containing the counters.
+   * @param row       Row key to increment.
+   * @param family    Column family to increment.
+   * @param qualifier Column qualifier to increment.
+   * @param count     Amount to increment by.
+   */
+  private void incrementCounter(
+      Map<byte[], Map<byte[], NavigableMap<byte[], Long>>> counters,
+      byte[] row, byte[] family, byte[] qualifier, Long count) {
+
+    Map<byte[], NavigableMap<byte[], Long>> families = counters.get(row);
+    if (families == null) {
+      families = Maps.newTreeMap(Bytes.BYTES_COMPARATOR);
+      counters.put(row, families);
+    }
+
+    NavigableMap<byte[], Long> qualifiers = families.get(family);
+    if (qualifiers == null) {
+      qualifiers = Maps.newTreeMap(Bytes.BYTES_COMPARATOR);
+      families.put(family, qualifiers);
+    }
+
+    Long existingValue = qualifiers.get(qualifier);
+    if (existingValue == null) {
+      qualifiers.put(qualifier, count);
+    } else {
+      qualifiers.put(qualifier, existingValue + count);
+    }
+  }
+
+  @VisibleForTesting
+  @InterfaceAudience.Private
+  HbaseEventSerializer getSerializer() {
+    return serializer;
+  }
+
+  @VisibleForTesting
+  @InterfaceAudience.Private
+  interface DebugIncrementsCallback {
+    public void onAfterCoalesce(Iterable<Increment> increments);
+  }
+}
diff --git a/code/flume-ng-sinks/flume-ng-hbase-sink/src/main/java/org/apache/flume/sink/hbase/HBaseSinkConfigurationConstants.java b/code/flume-ng-sinks/flume-ng-hbase-sink/src/main/java/org/apache/flume/sink/hbase/HBaseSinkConfigurationConstants.java
new file mode 100644
index 0000000..5560624
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-hbase-sink/src/main/java/org/apache/flume/sink/hbase/HBaseSinkConfigurationConstants.java
@@ -0,0 +1,77 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.flume.sink.hbase;
+
+import org.apache.hadoop.hbase.HConstants;
+
+/**
+ * Constants used for configuration of HBaseSink and AsyncHBaseSink
+ *
+ */
+public class HBaseSinkConfigurationConstants {
+  /**
+   * The Hbase table which the sink should write to.
+   */
+  public static final String CONFIG_TABLE = "table";
+  /**
+   * The column family which the sink should use.
+   */
+  public static final String CONFIG_COLUMN_FAMILY = "columnFamily";
+  /**
+   * Maximum number of events the sink should take from the channel per
+   * transaction, if available.
+   */
+  public static final String CONFIG_BATCHSIZE = "batchSize";
+  /**
+   * The fully qualified class name of the serializer the sink should use.
+   */
+  public static final String CONFIG_SERIALIZER = "serializer";
+  /**
+   * Configuration to pass to the serializer.
+   */
+  public static final String CONFIG_SERIALIZER_PREFIX = CONFIG_SERIALIZER + ".";
+
+  public static final String CONFIG_TIMEOUT = "timeout";
+
+  public static final String CONFIG_ENABLE_WAL = "enableWal";
+
+  public static final boolean DEFAULT_ENABLE_WAL = true;
+
+  public static final long DEFAULT_TIMEOUT = 60000;
+
+  public static final String CONFIG_KEYTAB = "kerberosKeytab";
+
+  public static final String CONFIG_PRINCIPAL = "kerberosPrincipal";
+
+  public static final String ZK_QUORUM = "zookeeperQuorum";
+
+  public static final String ZK_ZNODE_PARENT = "znodeParent";
+
+  public static final String DEFAULT_ZK_ZNODE_PARENT =
+      HConstants.DEFAULT_ZOOKEEPER_ZNODE_PARENT;
+
+  public static final String CONFIG_COALESCE_INCREMENTS = "coalesceIncrements";
+
+  public static final Boolean DEFAULT_COALESCE_INCREMENTS = false;
+
+  public static final int DEFAULT_MAX_CONSECUTIVE_FAILS = 10;
+
+  public static final String CONFIG_MAX_CONSECUTIVE_FAILS = "maxConsecutiveFails";
+
+}
diff --git a/code/flume-ng-sinks/flume-ng-hbase-sink/src/main/java/org/apache/flume/sink/hbase/HbaseEventSerializer.java b/code/flume-ng-sinks/flume-ng-hbase-sink/src/main/java/org/apache/flume/sink/hbase/HbaseEventSerializer.java
new file mode 100644
index 0000000..d4e3f84
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-hbase-sink/src/main/java/org/apache/flume/sink/hbase/HbaseEventSerializer.java
@@ -0,0 +1,61 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.flume.sink.hbase;
+
+import java.util.List;
+
+import org.apache.flume.Event;
+import org.apache.flume.conf.Configurable;
+import org.apache.flume.conf.ConfigurableComponent;
+import org.apache.hadoop.hbase.client.Increment;
+import org.apache.hadoop.hbase.client.Row;
+
+/**
+ * Interface for an event serializer which serializes the headers and body
+ * of an event to write them to HBase. This is configurable, so any config
+ * params required should be taken through this. Only the column family is
+ * passed in. The columns should exist in the table and column family
+ * specified in the configuration for the HBaseSink.
+ */
+public interface HbaseEventSerializer extends Configurable, ConfigurableComponent {
+  /**
+   * Initialize the event serializer.
+   * @param event Event to be written to HBase
+   * @param columnFamily Column family to write to
+   */
+  public void initialize(Event event, byte[] columnFamily);
+
+  /**
+   * Get the actions that should be written out to hbase as a result of this
+   * event. This list is written to hbase using the HBase batch API.
+   * @return List of {@link org.apache.hadoop.hbase.client.Row} which
+   * are written as such to HBase.
+   *
+   * 0.92 increments do not implement Row, so this is not generic.
+   *
+   */
+  public List<Row> getActions();
+
+  public List<Increment> getIncrements();
+
+  /**
+   * Clean up any state. This will be called when the sink is being stopped.
+   */
+  public void close();
+}
diff --git a/code/flume-ng-sinks/flume-ng-hbase-sink/src/main/java/org/apache/flume/sink/hbase/KfkAsyncHbaseEventSerializer.java b/code/flume-ng-sinks/flume-ng-hbase-sink/src/main/java/org/apache/flume/sink/hbase/KfkAsyncHbaseEventSerializer.java
new file mode 100644
index 0000000..126b300
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-hbase-sink/src/main/java/org/apache/flume/sink/hbase/KfkAsyncHbaseEventSerializer.java
@@ -0,0 +1,155 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.flume.sink.hbase;
+
+import com.google.common.base.Charsets;
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.flume.FlumeException;
+import org.apache.flume.conf.ComponentConfiguration;
+import org.apache.flume.sink.hbase.SimpleHbaseEventSerializer.KeyType;
+import org.hbase.async.AtomicIncrementRequest;
+import org.hbase.async.PutRequest;
+
+import java.util.ArrayList;
+import java.util.List;
+
+/**
+ * A serializer to be used with the AsyncHBaseSink that splits the
+ * comma-separated event body into one put per configured column, keyed by
+ * the user id and timestamp fields of the event. The headers are discarded.
+ * It also updates a row in hbase which acts as an event counter.
+ *
+ * Takes optional parameters:<p>
+ * <tt>rowPrefix:</tt> The prefix to be used. Default: <i>default</i><p>
+ * <tt>incrementRow</tt> The row to increment. Default: <i>incRow</i><p>
+ * <tt>suffix:</tt> <i>uuid/random/timestamp.</i>Default: <i>uuid</i><p>
+ *
+ * Mandatory parameters: <p>
+ * <tt>cf:</tt>Column family.<p>
+ * Components that have no defaults and will not be used if absent:
+ * <tt>payloadColumn:</tt> Comma-separated list of columns to put the payload
+ * fields in. If it is not present, event data will not be written.<p>
+ * <tt>incrementColumn:</tt> Which column to increment. If this is absent, it
+ *  means no column is incremented.
+ */
+public class KfkAsyncHbaseEventSerializer implements AsyncHbaseEventSerializer {
+    private byte[] table;
+    private byte[] cf;
+    private byte[] payload;
+    private byte[] payloadColumn;
+    private byte[] incrementColumn;
+    private String rowPrefix;
+    private byte[] incrementRow;
+    private KeyType keyType;
+
+    @Override
+    public void initialize(byte[] table, byte[] cf) {
+        this.table = table;
+        this.cf = cf;
+    }
+
+    @Override
+    public List<PutRequest> getActions() {
+        List<PutRequest> actions = new ArrayList<>();
+        if (payloadColumn != null) {
+            byte[] rowKey;
+            try {
+                /*--------------------------- custom code begins ---------------------------*/
+                // Parse the comma-separated column names.
+                String[] columns = new String(this.payloadColumn, Charsets.UTF_8).split(",");
+                // Parse the comma-separated values of each line collected by Flume.
+                String[] values = new String(this.payload, Charsets.UTF_8).split(",");
+                // Validation: every column needs a value, and the row key needs two fields.
+                if (columns.length == values.length && values.length >= 2) {
+                    // Timestamp
+                    String datetime = values[0];
+                    // User id
+                    String userid = values[1];
+                    // Build the business-specific row key once per event so that
+                    // all of its columns land in the same row.
+                    rowKey = SimpleRowKeyGenerator.getKfkRowKey(userid, datetime);
+                    for (int i = 0; i < columns.length; i++) {
+                        byte[] colColumn = columns[i].getBytes(Charsets.UTF_8);
+                        byte[] colValue = values[i].getBytes(Charsets.UTF_8);
+                        // Insert the data
+                        PutRequest putRequest = new PutRequest(table, rowKey, cf,
+                                colColumn, colValue);
+                        actions.add(putRequest);
+                    }
+                }
+                /*--------------------------- custom code ends ---------------------------*/
+            } catch (Exception e) {
+                throw new FlumeException("Could not get row key!", e);
+            }
+        }
+        return actions;
+    }
+
+    public List<AtomicIncrementRequest> getIncrements() {
+        List<AtomicIncrementRequest> actions = new ArrayList<AtomicIncrementRequest>();
+        if (incrementColumn != null) {
+            AtomicIncrementRequest inc = new AtomicIncrementRequest(table,
+                    incrementRow, cf, incrementColumn);
+            actions.add(inc);
+        }
+        return actions;
+    }
+
+    @Override
+    public void cleanUp() {
+        // TODO Auto-generated method stub
+
+    }
+
+    @Override
+    public void configure(Context context) {
+        String pCol = context.getString("payloadColumn", "pCol");
+        String iCol = context.getString("incrementColumn", "iCol");
+        rowPrefix = context.getString("rowPrefix", "default");
+        String suffix = context.getString("suffix", "uuid");
+        if (pCol != null && !pCol.isEmpty()) {
+            if (suffix.equals("timestamp")) {
+                keyType = KeyType.TS;
+            } else if (suffix.equals("random")) {
+                keyType = KeyType.RANDOM;
+            } else if (suffix.equals("nano")) {
+                keyType = KeyType.TSNANO;
+            } else {
+                keyType = KeyType.UUID;
+            }
+            payloadColumn = pCol.getBytes(Charsets.UTF_8);
+        }
+        if (iCol != null && !iCol.isEmpty()) {
+            incrementColumn = iCol.getBytes(Charsets.UTF_8);
+        }
+        incrementRow = context.getString("incrementRow", "incRow").getBytes(Charsets.UTF_8);
+    }
+
+    @Override
+    public void setEvent(Event event) {
+        this.payload = event.getBody();
+    }
+
+    @Override
+    public void configure(ComponentConfiguration conf) {
+        // TODO Auto-generated method stub
+    }
+
+}
diff --git a/code/flume-ng-sinks/flume-ng-hbase-sink/src/main/java/org/apache/flume/sink/hbase/RegexHbaseEventSerializer.java b/code/flume-ng-sinks/flume-ng-hbase-sink/src/main/java/org/apache/flume/sink/hbase/RegexHbaseEventSerializer.java
new file mode 100644
index 0000000..8342d67
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-hbase-sink/src/main/java/org/apache/flume/sink/hbase/RegexHbaseEventSerializer.java
@@ -0,0 +1,215 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.flume.sink.hbase;
+
+import com.google.common.base.Charsets;
+import com.google.common.collect.Lists;
+import org.apache.commons.lang.RandomStringUtils;
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.flume.FlumeException;
+import org.apache.flume.conf.ComponentConfiguration;
+import org.apache.hadoop.hbase.client.Increment;
+import org.apache.hadoop.hbase.client.Put;
+import org.apache.hadoop.hbase.client.Row;
+
+import java.nio.charset.Charset;
+import java.util.Calendar;
+import java.util.List;
+import java.util.Map;
+import java.util.concurrent.atomic.AtomicInteger;
+import java.util.regex.Matcher;
+import java.util.regex.Pattern;
+
+/**
+ * An {@link HbaseEventSerializer} which parses columns based on a supplied
+ * regular expression and column name list.
+ * <p>
+ * Note that if the regular expression does not return the correct number of
+ * groups for a particular event, or it does not correctly match an event,
+ * the event is silently dropped.
+ * <p>
+ * Row keys for each event consist of a timestamp concatenated with an
+ * identifier which enforces uniqueness of keys across flume agents.
+ * <p>
+ * See static constant variables for configuration options.
+ */
+public class RegexHbaseEventSerializer implements HbaseEventSerializer {
+  // Config vars
+  /** Regular expression used to parse groups from event data. */
+  public static final String REGEX_CONFIG = "regex";
+  public static final String REGEX_DEFAULT = "(.*)";
+
+  /** Whether to ignore case when performing regex matches. */
+  public static final String IGNORE_CASE_CONFIG = "regexIgnoreCase";
+  public static final boolean INGORE_CASE_DEFAULT = false;
+
+  /** Comma separated list of column names to place matching groups in. */
+  public static final String COL_NAME_CONFIG = "colNames";
+  public static final String COLUMN_NAME_DEFAULT = "payload";
+
+  /** Index of the row key in matched regex groups */
+  public static final String ROW_KEY_INDEX_CONFIG = "rowKeyIndex";
+
+  /** Placeholder in colNames for row key */
+  public static final String ROW_KEY_NAME = "ROW_KEY";
+
+  /** Whether to deposit event headers into corresponding column qualifiers */
+  public static final String DEPOSIT_HEADERS_CONFIG = "depositHeaders";
+  public static final boolean DEPOSIT_HEADERS_DEFAULT = false;
+
+  /** What charset to use when serializing into HBase's byte arrays */
+  public static final String CHARSET_CONFIG = "charset";
+  public static final String CHARSET_DEFAULT = "UTF-8";
+
+  /* This is a nonce used in HBase row-keys, such that the same row-key
+   * never gets written more than once from within this JVM. */
+  protected static final AtomicInteger nonce = new AtomicInteger(0);
+  protected static String randomKey = RandomStringUtils.randomAlphanumeric(10);
+
+  protected byte[] cf;
+  private byte[] payload;
+  private List<byte[]> colNames = Lists.newArrayList();
+  private Map<String, String> headers;
+  private boolean regexIgnoreCase;
+  private boolean depositHeaders;
+  private Pattern inputPattern;
+  private Charset charset;
+  private int rowKeyIndex;
+
+  @Override
+  public void configure(Context context) {
+    String regex = context.getString(REGEX_CONFIG, REGEX_DEFAULT);
+    regexIgnoreCase = context.getBoolean(IGNORE_CASE_CONFIG,
+        INGORE_CASE_DEFAULT);
+    depositHeaders = context.getBoolean(DEPOSIT_HEADERS_CONFIG,
+        DEPOSIT_HEADERS_DEFAULT);
+    inputPattern = Pattern.compile(regex, Pattern.DOTALL
+        | (regexIgnoreCase ? Pattern.CASE_INSENSITIVE : 0));
+    charset = Charset.forName(context.getString(CHARSET_CONFIG,
+        CHARSET_DEFAULT));
+
+    String colNameStr = context.getString(COL_NAME_CONFIG, COLUMN_NAME_DEFAULT);
+    String[] columnNames = colNameStr.split(",");
+    for (String s : columnNames) {
+      colNames.add(s.getBytes(charset));
+    }
+
+    // The row key index is optional; the default of -1 means a generated key is used.
+    rowKeyIndex = context.getInteger(ROW_KEY_INDEX_CONFIG, -1);
+    // If a row key index is set, make sure it is specified correctly.
+    if (rowKeyIndex >= 0) {
+      if (rowKeyIndex >= columnNames.length) {
+        throw new IllegalArgumentException(ROW_KEY_INDEX_CONFIG + " must be " +
+            "less than num columns " + columnNames.length);
+      }
+      if (!ROW_KEY_NAME.equalsIgnoreCase(columnNames[rowKeyIndex])) {
+        throw new IllegalArgumentException("Column at " + rowKeyIndex + " must be "
+            + ROW_KEY_NAME + " and is " + columnNames[rowKeyIndex]);
+      }
+    }
+  }
+
+  @Override
+  public void configure(ComponentConfiguration conf) {
+  }
+
+  @Override
+  public void initialize(Event event, byte[] columnFamily) {
+    this.headers = event.getHeaders();
+    this.payload = event.getBody();
+    this.cf = columnFamily;
+  }
+
+  /**
+   * Returns a row-key with the following format:
+   * [time in millis]-[random key]-[nonce]
+   */
+  protected byte[] getRowKey(Calendar cal) {
+    /* NOTE: This key generation strategy has the following properties:
+     *
+     * 1) Within a single JVM, the same row key will never be duplicated.
+     * 2) Amongst any two JVMs operating at different time periods (according
+     *    to their respective clocks), the same row key will never be
+     *    duplicated.
+     * 3) Amongst any two JVMs operating concurrently (according to their
+     *    respective clocks), the odds of duplicating a row-key are non-zero
+     *    but infinitesimal. This would require simultaneous collision in (a)
+     *    the timestamp (b) the respective nonce and (c) the random string.
+     *    The string is necessary since (a) and (b) could collide if a fleet
+     *    of Flume agents is restarted in tandem.
+     *
+     *  Row-key uniqueness is important because conflicting row-keys will cause
+     *  data loss. */
+    String rowKey = String.format("%s-%s-%s", cal.getTimeInMillis(),
+        randomKey, nonce.getAndIncrement());
+    return rowKey.getBytes(charset);
+  }
+
+  protected byte[] getRowKey() {
+    return getRowKey(Calendar.getInstance());
+  }
+
+  @Override
+  public List<Row> getActions() throws FlumeException {
+    List<Row> actions = Lists.newArrayList();
+    byte[] rowKey;
+    Matcher m = inputPattern.matcher(new String(payload, charset));
+    if (!m.matches()) {
+      return Lists.newArrayList();
+    }
+
+    if (m.groupCount() != colNames.size()) {
+      return Lists.newArrayList();
+    }
+
+    try {
+      if (rowKeyIndex < 0) {
+        rowKey = getRowKey();
+      } else {
+        rowKey = m.group(rowKeyIndex + 1).getBytes(Charsets.UTF_8);
+      }
+      Put put = new Put(rowKey);
+
+      for (int i = 0; i < colNames.size(); i++) {
+        if (i != rowKeyIndex) {
+          put.add(cf, colNames.get(i), m.group(i + 1).getBytes(Charsets.UTF_8));
+        }
+      }
+      if (depositHeaders) {
+        for (Map.Entry<String, String> entry : headers.entrySet()) {
+          put.add(cf, entry.getKey().getBytes(charset), entry.getValue().getBytes(charset));
+        }
+      }
+      actions.add(put);
+    } catch (Exception e) {
+      throw new FlumeException("Could not get row key!", e);
+    }
+    return actions;
+  }
+
+  @Override
+  public List<Increment> getIncrements() {
+    return Lists.newArrayList();
+  }
+
+  @Override
+  public void close() {
+  }
+}
\ No newline at end of file
diff --git a/code/flume-ng-sinks/flume-ng-hbase-sink/src/main/java/org/apache/flume/sink/hbase/SimpleAsyncHbaseEventSerializer.java b/code/flume-ng-sinks/flume-ng-hbase-sink/src/main/java/org/apache/flume/sink/hbase/SimpleAsyncHbaseEventSerializer.java
new file mode 100644
index 0000000..3f442e8
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-hbase-sink/src/main/java/org/apache/flume/sink/hbase/SimpleAsyncHbaseEventSerializer.java
@@ -0,0 +1,148 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.flume.sink.hbase;
+
+import com.google.common.base.Charsets;
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.flume.FlumeException;
+import org.apache.flume.conf.ComponentConfiguration;
+import org.apache.flume.sink.hbase.SimpleHbaseEventSerializer.KeyType;
+import org.hbase.async.AtomicIncrementRequest;
+import org.hbase.async.PutRequest;
+
+import java.util.ArrayList;
+import java.util.List;
+
+/**
+ * A simple serializer to be used with the AsyncHBaseSink
+ * that returns puts from an event, by writing the event
+ * body into it. The headers are discarded. It also updates a row in hbase
+ * which acts as an event counter.
+ *
+ * Takes optional parameters:<p>
+ * <tt>rowPrefix:</tt> The prefix to be used. Default: <i>default</i><p>
+ * <tt>incrementRow</tt> The row to increment. Default: <i>incRow</i><p>
+ * <tt>suffix:</tt> <i>uuid/random/timestamp/nano.</i> Default: <i>uuid</i><p>
+ *
+ * Mandatory parameters: <p>
+ * <tt>cf:</tt>Column family.<p>
+ * Components that have no defaults and will not be used if absent:
+ * <tt>payloadColumn:</tt> Which column to put payload in. If it is not present,
+ * event data will not be written.<p>
+ * <tt>incrementColumn:</tt> Which column to increment. If this is absent, it
+ *  means no column is incremented.
+ */
+public class SimpleAsyncHbaseEventSerializer implements AsyncHbaseEventSerializer {
+  private byte[] table;
+  private byte[] cf;
+  private byte[] payload;
+  private byte[] payloadColumn;
+  private byte[] incrementColumn;
+  private String rowPrefix;
+  private byte[] incrementRow;
+  private KeyType keyType;
+
+  @Override
+  public void initialize(byte[] table, byte[] cf) {
+    this.table = table;
+    this.cf = cf;
+  }
+
+  @Override
+  public List<PutRequest> getActions() {
+    List<PutRequest> actions = new ArrayList<PutRequest>();
+    if (payloadColumn != null) {
+      byte[] rowKey;
+      try {
+        switch (keyType) {
+          case TS:
+            rowKey = SimpleRowKeyGenerator.getTimestampKey(rowPrefix);
+            break;
+          case TSNANO:
+            rowKey = SimpleRowKeyGenerator.getNanoTimestampKey(rowPrefix);
+            break;
+          case RANDOM:
+            rowKey = SimpleRowKeyGenerator.getRandomKey(rowPrefix);
+            break;
+          default:
+            rowKey = SimpleRowKeyGenerator.getUUIDKey(rowPrefix);
+            break;
+        }
+        PutRequest putRequest =  new PutRequest(table, rowKey, cf,
+            payloadColumn, payload);
+        actions.add(putRequest);
+      } catch (Exception e) {
+        throw new FlumeException("Could not get row key!", e);
+      }
+    }
+    return actions;
+  }
+
+  public List<AtomicIncrementRequest> getIncrements() {
+    List<AtomicIncrementRequest> actions = new ArrayList<AtomicIncrementRequest>();
+    if (incrementColumn != null) {
+      AtomicIncrementRequest inc = new AtomicIncrementRequest(table,
+          incrementRow, cf, incrementColumn);
+      actions.add(inc);
+    }
+    return actions;
+  }
+
+  @Override
+  public void cleanUp() {
+    // TODO Auto-generated method stub
+
+  }
+
+  @Override
+  public void configure(Context context) {
+    String pCol = context.getString("payloadColumn", "pCol");
+    String iCol = context.getString("incrementColumn", "iCol");
+    rowPrefix = context.getString("rowPrefix", "default");
+    String suffix = context.getString("suffix", "uuid");
+    if (pCol != null && !pCol.isEmpty()) {
+      if (suffix.equals("timestamp")) {
+        keyType = KeyType.TS;
+      } else if (suffix.equals("random")) {
+        keyType = KeyType.RANDOM;
+      } else if (suffix.equals("nano")) {
+        keyType = KeyType.TSNANO;
+      } else {
+        keyType = KeyType.UUID;
+      }
+      payloadColumn = pCol.getBytes(Charsets.UTF_8);
+    }
+    if (iCol != null && !iCol.isEmpty()) {
+      incrementColumn = iCol.getBytes(Charsets.UTF_8);
+    }
+    incrementRow = context.getString("incrementRow", "incRow").getBytes(Charsets.UTF_8);
+  }
+
+  @Override
+  public void setEvent(Event event) {
+    this.payload = event.getBody();
+  }
+
+  @Override
+  public void configure(ComponentConfiguration conf) {
+    // TODO Auto-generated method stub
+  }
+
+}
diff --git a/code/flume-ng-sinks/flume-ng-hbase-sink/src/main/java/org/apache/flume/sink/hbase/SimpleHbaseEventSerializer.java b/code/flume-ng-sinks/flume-ng-hbase-sink/src/main/java/org/apache/flume/sink/hbase/SimpleHbaseEventSerializer.java
new file mode 100644
index 0000000..dc89fd7
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-hbase-sink/src/main/java/org/apache/flume/sink/hbase/SimpleHbaseEventSerializer.java
@@ -0,0 +1,146 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.flume.sink.hbase;
+
+import com.google.common.base.Charsets;
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.flume.FlumeException;
+import org.apache.flume.conf.ComponentConfiguration;
+import org.apache.hadoop.hbase.client.Increment;
+import org.apache.hadoop.hbase.client.Put;
+import org.apache.hadoop.hbase.client.Row;
+
+import java.util.LinkedList;
+import java.util.List;
+
+/**
+ * A simple serializer that returns puts from an event, by writing the event
+ * body into it. The headers are discarded. It also updates a row in hbase
+ * which acts as an event counter.
+ * <p>Takes optional parameters:<p>
+ * <tt>rowPrefix:</tt> The prefix to be used. Default: <i>default</i><p>
+ * <tt>incrementRow</tt> The row to increment. Default: <i>incRow</i><p>
+ * <tt>suffix:</tt> <i>uuid/random/timestamp/nano.</i> Default: <i>uuid</i><p>
+ * <p>Mandatory parameters: <p>
+ * <tt>cf:</tt>Column family.<p>
+ * Components that have no defaults and will not be used if null:
+ * <tt>payloadColumn:</tt> Which column to put payload in. If it is null,
+ * event data will not be written.<p>
+ * <tt>incrementColumn:</tt> Which column to increment. Null means no column
+ * is incremented.
+ */
+public class SimpleHbaseEventSerializer implements HbaseEventSerializer {
+  private String rowPrefix;
+  private byte[] incrementRow;
+  private byte[] cf;
+  private byte[] plCol;
+  private byte[] incCol;
+  private KeyType keyType;
+  private byte[] payload;
+
+  public SimpleHbaseEventSerializer() {
+  }
+
+  @Override
+  public void configure(Context context) {
+    rowPrefix = context.getString("rowPrefix", "default");
+    incrementRow =
+        context.getString("incrementRow", "incRow").getBytes(Charsets.UTF_8);
+    String suffix = context.getString("suffix", "uuid");
+
+    String payloadColumn = context.getString("payloadColumn", "pCol");
+    String incColumn = context.getString("incrementColumn", "iCol");
+    if (payloadColumn != null && !payloadColumn.isEmpty()) {
+      if (suffix.equals("timestamp")) {
+        keyType = KeyType.TS;
+      } else if (suffix.equals("random")) {
+        keyType = KeyType.RANDOM;
+      } else if (suffix.equals("nano")) {
+        keyType = KeyType.TSNANO;
+      } else {
+        keyType = KeyType.UUID;
+      }
+      plCol = payloadColumn.getBytes(Charsets.UTF_8);
+    }
+    if (incColumn != null && !incColumn.isEmpty()) {
+      incCol = incColumn.getBytes(Charsets.UTF_8);
+    }
+  }
+
+  @Override
+  public void configure(ComponentConfiguration conf) {
+  }
+
+  @Override
+  public void initialize(Event event, byte[] cf) {
+    this.payload = event.getBody();
+    this.cf = cf;
+  }
+
+  @Override
+  public List<Row> getActions() throws FlumeException {
+    List<Row> actions = new LinkedList<Row>();
+    if (plCol != null) {
+      byte[] rowKey;
+      try {
+        if (keyType == KeyType.TS) {
+          rowKey = SimpleRowKeyGenerator.getTimestampKey(rowPrefix);
+        } else if (keyType == KeyType.RANDOM) {
+          rowKey = SimpleRowKeyGenerator.getRandomKey(rowPrefix);
+        } else if (keyType == KeyType.TSNANO) {
+          rowKey = SimpleRowKeyGenerator.getNanoTimestampKey(rowPrefix);
+        } else {
+          rowKey = SimpleRowKeyGenerator.getUUIDKey(rowPrefix);
+        }
+        Put put = new Put(rowKey);
+        put.add(cf, plCol, payload);
+        actions.add(put);
+      } catch (Exception e) {
+        throw new FlumeException("Could not get row key!", e);
+      }
+
+    }
+    return actions;
+  }
+
+  @Override
+  public List<Increment> getIncrements() {
+    List<Increment> incs = new LinkedList<Increment>();
+    if (incCol != null) {
+      Increment inc = new Increment(incrementRow);
+      inc.addColumn(cf, incCol, 1);
+      incs.add(inc);
+    }
+    return incs;
+  }
+
+  @Override
+  public void close() {
+  }
+
+  public enum KeyType {
+    UUID,
+    RANDOM,
+    TS,
+    TSNANO;
+  }
+
+}
diff --git a/code/flume-ng-sinks/flume-ng-hbase-sink/src/main/java/org/apache/flume/sink/hbase/SimpleRowKeyGenerator.java b/code/flume-ng-sinks/flume-ng-hbase-sink/src/main/java/org/apache/flume/sink/hbase/SimpleRowKeyGenerator.java
new file mode 100644
index 0000000..0cabd2c
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-hbase-sink/src/main/java/org/apache/flume/sink/hbase/SimpleRowKeyGenerator.java
@@ -0,0 +1,50 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.flume.sink.hbase;
+
+import java.io.UnsupportedEncodingException;
+import java.util.Random;
+import java.util.UUID;
+
+/**
+ * Utility class for users to generate their own keys. Any key can be used,
+ * this is just a utility that provides a set of simple keys.
+ */
+public class SimpleRowKeyGenerator {
+
+  public static byte[] getUUIDKey(String prefix) throws UnsupportedEncodingException {
+    return (prefix + UUID.randomUUID().toString()).getBytes("UTF8");
+  }
+
+  public static byte[] getRandomKey(String prefix) throws UnsupportedEncodingException {
+    return (prefix + String.valueOf(new Random().nextLong())).getBytes("UTF8");
+  }
+
+  public static byte[] getTimestampKey(String prefix) throws UnsupportedEncodingException {
+    return (prefix + String.valueOf(System.currentTimeMillis())).getBytes("UTF8");
+  }
+
+  public static byte[] getNanoTimestampKey(String prefix) throws UnsupportedEncodingException {
+    return (prefix + String.valueOf(System.nanoTime())).getBytes("UTF8");
+  }
+  public static byte[] getKfkRowKey(String userid, String datetime) throws UnsupportedEncodingException {
+    return (userid + datetime + String.valueOf(System.currentTimeMillis())).getBytes("UTF8");
+  }
+
+}
diff --git a/code/flume-ng-sinks/flume-ng-hbase-sink/src/test/java/org/apache/flume/sink/hbase/IncrementAsyncHBaseSerializer.java b/code/flume-ng-sinks/flume-ng-hbase-sink/src/test/java/org/apache/flume/sink/hbase/IncrementAsyncHBaseSerializer.java
new file mode 100644
index 0000000..9a2be5a
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-hbase-sink/src/test/java/org/apache/flume/sink/hbase/IncrementAsyncHBaseSerializer.java
@@ -0,0 +1,78 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.flume.sink.hbase;
+
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.flume.conf.ComponentConfiguration;
+import org.hbase.async.AtomicIncrementRequest;
+import org.hbase.async.PutRequest;
+
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+
+/**
+ * An AsyncHBaseEventSerializer implementation that increments a configured
+ * column for the row whose row key is the event's body bytes.
+ */
+public class IncrementAsyncHBaseSerializer implements AsyncHbaseEventSerializer {
+  private byte[] table;
+  private byte[] cf;
+  private byte[] column;
+  private Event currentEvent;
+
+  @Override
+  public void initialize(byte[] table, byte[] cf) {
+    this.table = table;
+    this.cf = cf;
+  }
+
+  @Override
+  public void setEvent(Event event) {
+    this.currentEvent = event;
+  }
+
+  @Override
+  public List<PutRequest> getActions() {
+    return Collections.emptyList();
+  }
+
+  @Override
+  public List<AtomicIncrementRequest> getIncrements() {
+    List<AtomicIncrementRequest> incrs = new ArrayList<AtomicIncrementRequest>();
+    AtomicIncrementRequest incr = new AtomicIncrementRequest(table,
+        currentEvent.getBody(), cf, column, 1);
+    incrs.add(incr);
+    return incrs;
+  }
+
+  @Override
+  public void cleanUp() {
+  }
+
+  @Override
+  public void configure(Context context) {
+    column = context.getString("column", "col").getBytes();
+  }
+
+  @Override
+  public void configure(ComponentConfiguration conf) {
+  }
+}
diff --git a/code/flume-ng-sinks/flume-ng-hbase-sink/src/test/java/org/apache/flume/sink/hbase/IncrementHBaseSerializer.java b/code/flume-ng-sinks/flume-ng-hbase-sink/src/test/java/org/apache/flume/sink/hbase/IncrementHBaseSerializer.java
new file mode 100644
index 0000000..b4343eb
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-hbase-sink/src/test/java/org/apache/flume/sink/hbase/IncrementHBaseSerializer.java
@@ -0,0 +1,80 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.flume.sink.hbase;
+
+import com.google.common.annotations.VisibleForTesting;
+import com.google.common.base.Charsets;
+import com.google.common.collect.Lists;
+import java.util.Collections;
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.flume.conf.ComponentConfiguration;
+import org.apache.hadoop.hbase.client.Increment;
+import org.apache.hadoop.hbase.client.Row;
+
+import java.util.List;
+
+/**
+ * For Increment-related unit tests.
+ */
+class IncrementHBaseSerializer implements HbaseEventSerializer, BatchAware {
+  private Event event;
+  private byte[] family;
+  private int numBatchesStarted = 0;
+
+  @Override public void configure(Context context) { }
+  @Override public void configure(ComponentConfiguration conf) { }
+  @Override public void close() { }
+
+  @Override
+  public void initialize(Event event, byte[] columnFamily) {
+    this.event = event;
+    this.family = columnFamily;
+  }
+
+  // This class only creates Increments.
+  @Override
+  public List<Row> getActions() {
+    return Collections.emptyList();
+  }
+
+  // Treat each Event as a String, i.e., "row:qualifier".
+  @Override
+  public List<Increment> getIncrements() {
+    List<Increment> increments = Lists.newArrayList();
+    String body = new String(event.getBody(), Charsets.UTF_8);
+    String[] pieces = body.split(":");
+    String row = pieces[0];
+    String qualifier = pieces[1];
+    Increment inc = new Increment(row.getBytes(Charsets.UTF_8));
+    inc.addColumn(family, qualifier.getBytes(Charsets.UTF_8), 1L);
+    increments.add(inc);
+    return increments;
+  }
+
+  @Override
+  public void onBatchStart() {
+    numBatchesStarted++;
+  }
+
+  @VisibleForTesting
+  public int getNumBatchesStarted() {
+    return numBatchesStarted;
+  }
+}
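The "row:qualifier" convention used by `IncrementHBaseSerializer` above can be sketched without any HBase dependency. This is a minimal illustration, not part of the patch: `RowQualifierSketch`, `Target`, and `parse` are hypothetical names. Unlike the `split(":")` call in the test serializer, this sketch splits on the first colon only and rejects bodies that lack a row or a qualifier.

```java
import java.nio.charset.StandardCharsets;

public class RowQualifierSketch {
  /** Hypothetical holder for the two parts of a "row:qualifier" event body. */
  static final class Target {
    final String row;
    final String qualifier;

    Target(String row, String qualifier) {
      this.row = row;
      this.qualifier = qualifier;
    }
  }

  /** Decodes the body as UTF-8 and splits it at the first ':'. */
  static Target parse(byte[] body) {
    String text = new String(body, StandardCharsets.UTF_8);
    int sep = text.indexOf(':');
    // Reject a missing separator, an empty row, or an empty qualifier.
    if (sep <= 0 || sep == text.length() - 1) {
      throw new IllegalArgumentException(
          "Expected \"row:qualifier\" but got: " + text);
    }
    return new Target(text.substring(0, sep), text.substring(sep + 1));
  }

  public static void main(String[] args) {
    Target t = parse("r1:c1".getBytes(StandardCharsets.UTF_8));
    System.out.println(t.row + " / " + t.qualifier);
  }
}
```

The explicit check trades the test serializer's brevity for a clear error message in place of an `ArrayIndexOutOfBoundsException` when the body contains no colon.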
diff --git a/code/flume-ng-sinks/flume-ng-hbase-sink/src/test/java/org/apache/flume/sink/hbase/MockSimpleHbaseEventSerializer.java b/code/flume-ng-sinks/flume-ng-hbase-sink/src/test/java/org/apache/flume/sink/hbase/MockSimpleHbaseEventSerializer.java
new file mode 100644
index 0000000..9b2a850
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-hbase-sink/src/test/java/org/apache/flume/sink/hbase/MockSimpleHbaseEventSerializer.java
@@ -0,0 +1,38 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.flume.sink.hbase;
+
+import java.util.List;
+
+import org.apache.flume.FlumeException;
+import org.apache.hadoop.hbase.client.Row;
+
+class MockSimpleHbaseEventSerializer extends SimpleHbaseEventSerializer {
+
+  public static boolean throwException = false;
+
+  @Override
+  public List<Row> getActions() throws FlumeException {
+    if (throwException) {
+      throw new FlumeException("Exception for testing");
+    }
+    return super.getActions();
+  }
+}
\ No newline at end of file
diff --git a/code/flume-ng-sinks/flume-ng-hbase-sink/src/test/java/org/apache/flume/sink/hbase/TestAsyncHBaseSink.java b/code/flume-ng-sinks/flume-ng-hbase-sink/src/test/java/org/apache/flume/sink/hbase/TestAsyncHBaseSink.java
new file mode 100644
index 0000000..f8faa1e
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-hbase-sink/src/test/java/org/apache/flume/sink/hbase/TestAsyncHBaseSink.java
@@ -0,0 +1,618 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.flume.sink.hbase;
+
+import java.io.IOException;
+import java.lang.management.ManagementFactory;
+import java.lang.management.OperatingSystemMXBean;
+import java.util.Arrays;
+import java.util.HashMap;
+import java.util.Map;
+
+import org.apache.flume.Channel;
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.flume.EventDeliveryException;
+import org.apache.flume.FlumeException;
+import org.apache.flume.Transaction;
+import org.apache.flume.Sink.Status;
+import org.apache.flume.channel.MemoryChannel;
+import org.apache.flume.conf.Configurables;
+import org.apache.flume.event.EventBuilder;
+import org.apache.hadoop.hbase.HBaseTestingUtility;
+import org.apache.hadoop.hbase.HConstants;
+import org.apache.hadoop.hbase.client.HTable;
+import org.apache.hadoop.hbase.client.Result;
+import org.apache.hadoop.hbase.client.ResultScanner;
+import org.apache.hadoop.hbase.client.Scan;
+import org.apache.hadoop.hbase.util.Bytes;
+import org.apache.hadoop.hbase.zookeeper.ZKConfig;
+import org.junit.AfterClass;
+import org.junit.Assert;
+import org.junit.BeforeClass;
+import org.junit.Ignore;
+import org.junit.Test;
+
+import com.google.common.primitives.Longs;
+import com.sun.management.UnixOperatingSystemMXBean;
+
+import org.junit.After;
+
+public class TestAsyncHBaseSink {
+  private static HBaseTestingUtility testUtility = new HBaseTestingUtility();
+
+  private static String tableName = "TestHbaseSink";
+  private static String columnFamily = "TestColumnFamily";
+  private static String inColumn = "iCol";
+  private static String plCol = "pCol";
+  private static Context ctx = new Context();
+  private static String valBase = "testing hbase sink: jham";
+  private boolean deleteTable = true;
+  private static OperatingSystemMXBean os;
+
+
+  @BeforeClass
+  public static void setUp() throws Exception {
+    testUtility.startMiniCluster();
+
+    Map<String, String> ctxMap = new HashMap<String, String>();
+    ctxMap.put("table", tableName);
+    ctxMap.put("columnFamily", columnFamily);
+    ctxMap.put("serializer",
+        "org.apache.flume.sink.hbase.SimpleAsyncHbaseEventSerializer");
+    ctxMap.put("serializer.payloadColumn", plCol);
+    ctxMap.put("serializer.incrementColumn", inColumn);
+    ctxMap.put("keep-alive", "0");
+    ctxMap.put("timeout", "10000");
+    ctx.putAll(ctxMap);
+
+    os = ManagementFactory.getOperatingSystemMXBean();
+  }
+
+  @AfterClass
+  public static void tearDown() throws Exception {
+    testUtility.shutdownMiniCluster();
+  }
+
+  @After
+  public void tearDownTest() throws Exception {
+    if (deleteTable) {
+      testUtility.deleteTable(tableName.getBytes());
+    }
+  }
+
+  @Test
+  public void testOneEventWithDefaults() throws Exception {
+    Map<String,String> ctxMap = new HashMap<String,String>();
+    ctxMap.put("table", tableName);
+    ctxMap.put("columnFamily", columnFamily);
+    ctxMap.put("serializer",
+            "org.apache.flume.sink.hbase.SimpleAsyncHbaseEventSerializer");
+    ctxMap.put("keep-alive", "0");
+    ctxMap.put("timeout", "10000");
+    Context tmpctx = new Context();
+    tmpctx.putAll(ctxMap);
+
+    testUtility.createTable(tableName.getBytes(), columnFamily.getBytes());
+    deleteTable = true;
+    AsyncHBaseSink sink = new AsyncHBaseSink(testUtility.getConfiguration());
+    Configurables.configure(sink, tmpctx);
+    Channel channel = new MemoryChannel();
+    Configurables.configure(channel, tmpctx);
+    sink.setChannel(channel);
+    sink.start();
+    Transaction tx = channel.getTransaction();
+    tx.begin();
+    Event e = EventBuilder.withBody(
+            Bytes.toBytes(valBase));
+    channel.put(e);
+    tx.commit();
+    tx.close();
+    Assert.assertFalse(sink.isConfNull());
+    sink.process();
+    sink.stop();
+    HTable table = new HTable(testUtility.getConfiguration(), tableName);
+    byte[][] results = getResults(table, 1);
+    byte[] out = results[0];
+    Assert.assertArrayEquals(e.getBody(), out);
+    out = results[1];
+    Assert.assertArrayEquals(Longs.toByteArray(1), out);
+  }
+
+  @Test
+  public void testOneEvent() throws Exception {
+    testUtility.createTable(tableName.getBytes(), columnFamily.getBytes());
+    deleteTable = true;
+    AsyncHBaseSink sink = new AsyncHBaseSink(testUtility.getConfiguration());
+    Configurables.configure(sink, ctx);
+    Channel channel = new MemoryChannel();
+    Configurables.configure(channel, ctx);
+    sink.setChannel(channel);
+    sink.start();
+    Transaction tx = channel.getTransaction();
+    tx.begin();
+    Event e = EventBuilder.withBody(
+        Bytes.toBytes(valBase));
+    channel.put(e);
+    tx.commit();
+    tx.close();
+    Assert.assertFalse(sink.isConfNull());
+    sink.process();
+    sink.stop();
+    HTable table = new HTable(testUtility.getConfiguration(), tableName);
+    byte[][] results = getResults(table, 1);
+    byte[] out = results[0];
+    Assert.assertArrayEquals(e.getBody(), out);
+    out = results[1];
+    Assert.assertArrayEquals(Longs.toByteArray(1), out);
+  }
+
+  @Test
+  public void testThreeEvents() throws Exception {
+    testUtility.createTable(tableName.getBytes(), columnFamily.getBytes());
+    deleteTable = true;
+    AsyncHBaseSink sink = new AsyncHBaseSink(testUtility.getConfiguration());
+    Configurables.configure(sink, ctx);
+    Channel channel = new MemoryChannel();
+    Configurables.configure(channel, ctx);
+    sink.setChannel(channel);
+    sink.start();
+    Transaction tx = channel.getTransaction();
+    tx.begin();
+    for (int i = 0; i < 3; i++) {
+      Event e = EventBuilder.withBody(Bytes.toBytes(valBase + "-" + i));
+      channel.put(e);
+    }
+    tx.commit();
+    tx.close();
+    Assert.assertFalse(sink.isConfNull());
+    sink.process();
+    sink.stop();
+    HTable table = new HTable(testUtility.getConfiguration(), tableName);
+    byte[][] results = getResults(table, 3);
+    byte[] out;
+    int found = 0;
+    for (int i = 0; i < 3; i++) {
+      for (int j = 0; j < 3; j++) {
+        if (Arrays.equals(results[j], Bytes.toBytes(valBase + "-" + i))) {
+          found++;
+          break;
+        }
+      }
+    }
+    Assert.assertEquals(3, found);
+    out = results[3];
+    Assert.assertArrayEquals(Longs.toByteArray(3), out);
+  }
+
+  // This will fail without FLUME-1842's timeout fix, but with FLUME-1842's
+  // testing-oriented changes to the callback classes and the single-threaded
+  // executor used for tests in place.
+  @Test (expected = EventDeliveryException.class)
+  public void testTimeOut() throws Exception {
+    testUtility.createTable(tableName.getBytes(), columnFamily.getBytes());
+    deleteTable = true;
+    AsyncHBaseSink sink = new AsyncHBaseSink(testUtility.getConfiguration(), true, false);
+    Configurables.configure(sink, ctx);
+    Channel channel = new MemoryChannel();
+    Configurables.configure(channel, ctx);
+    sink.setChannel(channel);
+    channel.start();
+    sink.start();
+    Transaction tx = channel.getTransaction();
+    tx.begin();
+    for (int i = 0; i < 3; i++) {
+      Event e = EventBuilder.withBody(Bytes.toBytes(valBase + "-" + i));
+      channel.put(e);
+    }
+    tx.commit();
+    tx.close();
+    Assert.assertFalse(sink.isConfNull());
+    sink.process();
+    Assert.fail();
+  }
+
+  @Test
+  public void testMultipleBatches() throws Exception {
+    testUtility.createTable(tableName.getBytes(), columnFamily.getBytes());
+    deleteTable = true;
+    ctx.put("batchSize", "2");
+    AsyncHBaseSink sink = new AsyncHBaseSink(testUtility.getConfiguration());
+    Configurables.configure(sink, ctx);
+    //Reset the context to a higher batchSize
+    ctx.put("batchSize", "100");
+    Channel channel = new MemoryChannel();
+    Configurables.configure(channel, ctx);
+    sink.setChannel(channel);
+    sink.start();
+    Transaction tx = channel.getTransaction();
+    tx.begin();
+    for (int i = 0; i < 3; i++) {
+      Event e = EventBuilder.withBody(Bytes.toBytes(valBase + "-" + i));
+      channel.put(e);
+    }
+    tx.commit();
+    tx.close();
+    int count = 0;
+    Status status = Status.READY;
+    while (status != Status.BACKOFF) {
+      count++;
+      status = sink.process();
+    }
+    Assert.assertFalse(sink.isConfNull());
+    sink.stop();
+    Assert.assertEquals(2, count);
+    HTable table = new HTable(testUtility.getConfiguration(), tableName);
+    byte[][] results = getResults(table, 3);
+    byte[] out;
+    int found = 0;
+    for (int i = 0; i < 3; i++) {
+      for (int j = 0; j < 3; j++) {
+        if (Arrays.equals(results[j], Bytes.toBytes(valBase + "-" + i))) {
+          found++;
+          break;
+        }
+      }
+    }
+    Assert.assertEquals(3, found);
+    out = results[3];
+    Assert.assertArrayEquals(Longs.toByteArray(3), out);
+  }
+
+  @Test
+  public void testMultipleBatchesBatchIncrementsWithCoalescing() throws Exception {
+    doTestMultipleBatchesBatchIncrements(true);
+  }
+
+  @Test
+  public void testMultipleBatchesBatchIncrementsNoCoalescing() throws Exception {
+    doTestMultipleBatchesBatchIncrements(false);
+  }
+
+  public void doTestMultipleBatchesBatchIncrements(boolean coalesce) throws Exception {
+    testUtility.createTable(tableName.getBytes(), columnFamily.getBytes());
+    deleteTable = true;
+    AsyncHBaseSink sink = new AsyncHBaseSink(testUtility.getConfiguration(), false, true);
+    if (coalesce) {
+      ctx.put(HBaseSinkConfigurationConstants.CONFIG_COALESCE_INCREMENTS, "true");
+    }
+    ctx.put("batchSize", "2");
+    ctx.put("serializer", IncrementAsyncHBaseSerializer.class.getName());
+    ctx.put("serializer.column", "test");
+    Configurables.configure(sink, ctx);
+    //Reset the context to a higher batchSize
+    ctx.put("batchSize", "100");
+    // Restore the original serializer
+    ctx.put("serializer", SimpleAsyncHbaseEventSerializer.class.getName());
+    //Restore the no coalescing behavior
+    ctx.put(HBaseSinkConfigurationConstants.CONFIG_COALESCE_INCREMENTS,
+            "false");
+    Channel channel = new MemoryChannel();
+    Configurables.configure(channel, ctx);
+    sink.setChannel(channel);
+    sink.start();
+    Transaction tx = channel.getTransaction();
+    tx.begin();
+    for (int i = 0; i < 4; i++) {
+      for (int j = 0; j < 3; j++) {
+        Event e = EventBuilder.withBody(Bytes.toBytes(valBase + "-" + i));
+        channel.put(e);
+      }
+    }
+    tx.commit();
+    tx.close();
+    int count = 0;
+    Status status = Status.READY;
+    while (status != Status.BACKOFF) {
+      count++;
+      status = sink.process();
+    }
+    Assert.assertFalse(sink.isConfNull());
+    sink.stop();
+    Assert.assertEquals(7, count);
+    HTable table = new HTable(testUtility.getConfiguration(), tableName);
+    Scan scan = new Scan();
+    scan.addColumn(columnFamily.getBytes(), "test".getBytes());
+    scan.setStartRow(Bytes.toBytes(valBase));
+    ResultScanner rs = table.getScanner(scan);
+    int i = 0;
+    try {
+      for (Result r = rs.next(); r != null; r = rs.next()) {
+        byte[] out = r.getValue(columnFamily.getBytes(), "test".getBytes());
+        Assert.assertArrayEquals(Longs.toByteArray(3), out);
+        Assert.assertTrue(new String(r.getRow()).startsWith(valBase));
+        i++;
+      }
+    } finally {
+      rs.close();
+    }
+    Assert.assertEquals(4, i);
+    if (coalesce) {
+      Assert.assertEquals(8, sink.getTotalCallbacksReceived());
+    } else {
+      Assert.assertEquals(12, sink.getTotalCallbacksReceived());
+    }
+  }
+
+  @Test
+  public void testWithoutConfigurationObject() throws Exception {
+    testUtility.createTable(tableName.getBytes(), columnFamily.getBytes());
+    deleteTable = true;
+    ctx.put("batchSize", "2");
+    ctx.put(HBaseSinkConfigurationConstants.ZK_QUORUM,
+            ZKConfig.getZKQuorumServersString(testUtility.getConfiguration()));
+    ctx.put(HBaseSinkConfigurationConstants.ZK_ZNODE_PARENT,
+            testUtility.getConfiguration().get(HConstants.ZOOKEEPER_ZNODE_PARENT));
+    AsyncHBaseSink sink = new AsyncHBaseSink();
+    Configurables.configure(sink, ctx);
+    // Reset context to values usable by other tests.
+    ctx.put(HBaseSinkConfigurationConstants.ZK_QUORUM, null);
+    ctx.put(HBaseSinkConfigurationConstants.ZK_ZNODE_PARENT, null);
+    ctx.put("batchSize", "100");
+    Channel channel = new MemoryChannel();
+    Configurables.configure(channel, ctx);
+    sink.setChannel(channel);
+    sink.start();
+    Transaction tx = channel.getTransaction();
+    tx.begin();
+    for (int i = 0; i < 3; i++) {
+      Event e = EventBuilder.withBody(Bytes.toBytes(valBase + "-" + i));
+      channel.put(e);
+    }
+    tx.commit();
+    tx.close();
+    int count = 0;
+    Status status = Status.READY;
+    while (status != Status.BACKOFF) {
+      count++;
+      status = sink.process();
+    }
+    /*
+     * Make sure that the configuration was picked up from the context itself
+     * and not from a configuration object which was created by the sink.
+     */
+    Assert.assertTrue(sink.isConfNull());
+    sink.stop();
+    Assert.assertEquals(2, count);
+    HTable table = new HTable(testUtility.getConfiguration(), tableName);
+    byte[][] results = getResults(table, 3);
+    byte[] out;
+    int found = 0;
+    for (int i = 0; i < 3; i++) {
+      for (int j = 0; j < 3; j++) {
+        if (Arrays.equals(results[j], Bytes.toBytes(valBase + "-" + i))) {
+          found++;
+          break;
+        }
+      }
+    }
+    Assert.assertEquals(3, found);
+    out = results[3];
+    Assert.assertArrayEquals(Longs.toByteArray(3), out);
+  }
+
+  @Test(expected = FlumeException.class)
+  public void testMissingTable() throws Exception {
+    deleteTable = false;
+    ctx.put("batchSize", "2");
+    AsyncHBaseSink sink = new AsyncHBaseSink(testUtility.getConfiguration());
+    Configurables.configure(sink, ctx);
+    //Reset the context to a higher batchSize
+    ctx.put("batchSize", "100");
+    Channel channel = new MemoryChannel();
+    Configurables.configure(channel, ctx);
+    sink.setChannel(channel);
+    sink.start();
+    Transaction tx = channel.getTransaction();
+    tx.begin();
+    for (int i = 0; i < 3; i++) {
+      Event e = EventBuilder.withBody(Bytes.toBytes(valBase + "-" + i));
+      channel.put(e);
+    }
+    tx.commit();
+    tx.close();
+    sink.process();
+    Assert.assertFalse(sink.isConfNull());
+    HTable table = new HTable(testUtility.getConfiguration(), tableName);
+    byte[][] results = getResults(table, 2);
+    byte[] out;
+    int found = 0;
+    for (int i = 0; i < 2; i++) {
+      for (int j = 0; j < 2; j++) {
+        if (Arrays.equals(results[j], Bytes.toBytes(valBase + "-" + i))) {
+          found++;
+          break;
+        }
+      }
+    }
+    Assert.assertEquals(2, found);
+    out = results[2];
+    Assert.assertArrayEquals(Longs.toByteArray(2), out);
+    sink.process();
+    sink.stop();
+  }
+
+  // We only have support for getting File Descriptor count for Unix from the JDK
+  private long getOpenFileDescriptorCount() {
+    if (os instanceof UnixOperatingSystemMXBean) {
+      return ((UnixOperatingSystemMXBean) os).getOpenFileDescriptorCount();
+    } else {
+      return -1;
+    }
+  }
+
+  /*
+   * Before the fix for FLUME-2738, file descriptors were leaked consistently:
+   * more than 10 FDs per shutdown-reinitialize cycle. Over the 10
+   * shutdown-reinitialize cycles run here, a leak should therefore raise the
+   * FD count by well over 50, while a leak-free run should show no
+   * substantial increase.
+   * This test checks for a file descriptor leak by repeatedly failing
+   * transactions and shutting down and reinitializing the client each time,
+   * and it fails if the FD count grows by 50 or more.
+   */
+  @Test
+  public void testFDLeakOnShutdown() throws Exception {
+    if (getOpenFileDescriptorCount() < 0) {
+      return;
+    }
+    testUtility.createTable(tableName.getBytes(), columnFamily.getBytes());
+    deleteTable = true;
+    AsyncHBaseSink sink = new AsyncHBaseSink(testUtility.getConfiguration(),
+                                             true, false);
+    ctx.put("maxConsecutiveFails", "1");
+    Configurables.configure(sink, ctx);
+    Channel channel = new MemoryChannel();
+    Configurables.configure(channel, ctx);
+    sink.setChannel(channel);
+    channel.start();
+    sink.start();
+    Transaction tx = channel.getTransaction();
+    tx.begin();
+    for (int i = 0; i < 3; i++) {
+      Event e = EventBuilder.withBody(Bytes.toBytes(valBase + "-" + i));
+      channel.put(e);
+    }
+    tx.commit();
+    tx.close();
+    Assert.assertFalse(sink.isConfNull());
+    long initialFDCount = getOpenFileDescriptorCount();
+
+    // Since the isTimeOutTest is set to true, transaction will fail
+    // with EventDeliveryException
+    for (int i = 0; i < 10; i++) {
+      try {
+        sink.process();
+      } catch (EventDeliveryException ex) {
+      }
+    }
+    long increaseInFD = getOpenFileDescriptorCount() - initialFDCount;
+    Assert.assertTrue("File Descriptor leak detected. FDs have increased by " +
+                      increaseInFD + " from an initial FD count of " + initialFDCount,
+                      increaseInFD < 50);
+  }
+
+  /**
+   * This test must run last - it shuts down the minicluster :D
+   *
+   * @throws Exception
+   */
+  @Ignore("For dev builds only: " +
+          "This test takes too long, and this has to be run after all other " +
+          "tests, since it shuts down the minicluster. " +
+          "Comment out all other tests " +
+          "and comment out this annotation to run this test.")
+  @Test(expected = EventDeliveryException.class)
+  public void testHBaseFailure() throws Exception {
+    ctx.put("batchSize", "2");
+    testUtility.createTable(tableName.getBytes(), columnFamily.getBytes());
+    deleteTable = false;
+    AsyncHBaseSink sink = new AsyncHBaseSink(testUtility.getConfiguration());
+    Configurables.configure(sink, ctx);
+    //Reset the context to a higher batchSize
+    ctx.put("batchSize", "100");
+    Channel channel = new MemoryChannel();
+    Configurables.configure(channel, ctx);
+    sink.setChannel(channel);
+    sink.start();
+    Transaction tx = channel.getTransaction();
+    tx.begin();
+    for (int i = 0; i < 3; i++) {
+      Event e = EventBuilder.withBody(Bytes.toBytes(valBase + "-" + i));
+      channel.put(e);
+    }
+    tx.commit();
+    tx.close();
+    sink.process();
+    Assert.assertFalse(sink.isConfNull());
+    HTable table = new HTable(testUtility.getConfiguration(), tableName);
+    byte[][] results = getResults(table, 2);
+    byte[] out;
+    int found = 0;
+    for (int i = 0; i < 2; i++) {
+      for (int j = 0; j < 2; j++) {
+        if (Arrays.equals(results[j], Bytes.toBytes(valBase + "-" + i))) {
+          found++;
+          break;
+        }
+      }
+    }
+    Assert.assertEquals(2, found);
+    out = results[2];
+    Assert.assertArrayEquals(Longs.toByteArray(2), out);
+    testUtility.shutdownMiniCluster();
+    sink.process();
+    sink.stop();
+  }
+
+  /**
+   * Scans HBase for the rows in the payload column and the increment column
+   * of the given table. Expensive, so tread lightly.
+   * Calling this function multiple times for the same result set is a bad
+   * idea. Cache the result set once it is returned by this function.
+   *
+   * @param table the table to scan
+   * @param numEvents Number of events inserted into the table
+   * @return the payload values, followed by the increment column value
+   * @throws IOException if a scan fails
+   */
+  private byte[][] getResults(HTable table, int numEvents) throws IOException {
+    byte[][] results = new byte[numEvents + 1][];
+    Scan scan = new Scan();
+    scan.addColumn(columnFamily.getBytes(), plCol.getBytes());
+    scan.setStartRow(Bytes.toBytes("default"));
+    ResultScanner rs = table.getScanner(scan);
+    byte[] out = null;
+    int i = 0;
+    try {
+      for (Result r = rs.next(); r != null; r = rs.next()) {
+        out = r.getValue(columnFamily.getBytes(), plCol.getBytes());
+
+        if (i >= results.length - 1) {
+          rs.close();
+          throw new FlumeException("More results than expected in the table. " +
+                                   "Expected = " + numEvents + ". Found = " + i);
+        }
+        results[i++] = out;
+        System.out.println(Bytes.toStringBinary(out));
+      }
+    } finally {
+      rs.close();
+    }
+
+    Assert.assertEquals(results.length - 1, i);
+    scan = new Scan();
+    scan.addColumn(columnFamily.getBytes(), inColumn.getBytes());
+    scan.setStartRow(Bytes.toBytes("incRow"));
+    rs = table.getScanner(scan);
+    out = null;
+    try {
+      for (Result r = rs.next(); r != null; r = rs.next()) {
+        out = r.getValue(columnFamily.getBytes(), inColumn.getBytes());
+        results[i++] = out;
+        System.out.println(Bytes.toStringBinary(out));
+      }
+    } finally {
+      rs.close();
+    }
+    return results;
+  }
+}
+
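The counts asserted in `TestAsyncHBaseSink` (2 and 7 `process()` calls, 8 and 12 callbacks) can be reproduced with plain arithmetic. The sketch below is not part of the patch and does not touch the sink. It assumes that a full batch yields `READY` and the first short or empty batch yields `BACKOFF`, and that coalescing merges increments to the same row within one batch into a single request. `BatchCountSketch` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;

public class BatchCountSketch {
  /** Counts process() calls up to and including the first BACKOFF. */
  static int processCalls(int numEvents, int batchSize) {
    if (batchSize <= 0) {
      throw new IllegalArgumentException("batchSize must be positive");
    }
    int calls = 0;
    int remaining = numEvents;
    while (true) {
      calls++;
      int taken = Math.min(batchSize, remaining);
      remaining -= taken;
      if (taken < batchSize) {
        return calls; // Assumed: a short or empty batch means BACKOFF.
      }
    }
  }

  /** Counts increment callbacks, one per request actually sent. */
  static int callbacks(List<String> rowKeys, int batchSize, boolean coalesce) {
    int total = 0;
    for (int start = 0; start < rowKeys.size(); start += batchSize) {
      int end = Math.min(start + batchSize, rowKeys.size());
      List<String> batch = rowKeys.subList(start, end);
      // Assumed: coalescing sends one request per distinct row in a batch.
      total += coalesce ? new HashSet<String>(batch).size() : batch.size();
    }
    return total;
  }

  /** Mirrors the test input: 4 distinct rows, 3 consecutive events each. */
  static List<String> testRows() {
    List<String> rows = new ArrayList<String>();
    for (int i = 0; i < 4; i++) {
      for (int j = 0; j < 3; j++) {
        rows.add("row-" + i);
      }
    }
    return rows;
  }

  public static void main(String[] args) {
    System.out.println(processCalls(3, 2));           // 2
    System.out.println(processCalls(12, 2));          // 7
    System.out.println(callbacks(testRows(), 2, true));  // 8
    System.out.println(callbacks(testRows(), 2, false)); // 12
  }
}
```

With 12 events in batches of 2, the batches alternate between same-row pairs and mixed pairs, which is why coalescing yields 8 requests instead of 12.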
diff --git a/code/flume-ng-sinks/flume-ng-hbase-sink/src/test/java/org/apache/flume/sink/hbase/TestHBaseSink.java b/code/flume-ng-sinks/flume-ng-hbase-sink/src/test/java/org/apache/flume/sink/hbase/TestHBaseSink.java
new file mode 100644
index 0000000..217913b
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-hbase-sink/src/test/java/org/apache/flume/sink/hbase/TestHBaseSink.java
@@ -0,0 +1,744 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.flume.sink.hbase;
+
+import com.google.common.base.Charsets;
+import com.google.common.base.Throwables;
+import com.google.common.collect.Lists;
+import com.google.common.collect.Maps;
+import com.google.common.primitives.Longs;
+import org.apache.flume.Channel;
+import org.apache.flume.ChannelException;
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.flume.EventDeliveryException;
+import org.apache.flume.FlumeException;
+import org.apache.flume.Sink.Status;
+import org.apache.flume.Transaction;
+import org.apache.flume.channel.MemoryChannel;
+import org.apache.flume.conf.Configurables;
+import org.apache.flume.event.EventBuilder;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.hbase.HBaseTestingUtility;
+import org.apache.hadoop.hbase.HConstants;
+import org.apache.hadoop.hbase.client.HTable;
+import org.apache.hadoop.hbase.client.Increment;
+import org.apache.hadoop.hbase.client.Result;
+import org.apache.hadoop.hbase.client.ResultScanner;
+import org.apache.hadoop.hbase.client.Scan;
+import org.apache.hadoop.hbase.util.Bytes;
+import org.apache.hadoop.hbase.zookeeper.ZKConfig;
+import org.junit.After;
+import org.junit.AfterClass;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.BeforeClass;
+import org.junit.Ignore;
+import org.junit.Test;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.lang.reflect.Method;
+import java.util.Arrays;
+import java.util.List;
+import java.util.Map;
+import java.util.NavigableMap;
+
+import static org.mockito.Mockito.doReturn;
+import static org.mockito.Mockito.doThrow;
+import static org.mockito.Mockito.spy;
+
+public class TestHBaseSink {
+  private static final Logger logger =
+      LoggerFactory.getLogger(TestHBaseSink.class);
+
+  private static final HBaseTestingUtility testUtility = new HBaseTestingUtility();
+  private static final String tableName = "TestHbaseSink";
+  private static final String columnFamily = "TestColumnFamily";
+  private static final String inColumn = "iCol";
+  private static final String plCol = "pCol";
+  private static final String valBase = "testing hbase sink: jham";
+
+  private Configuration conf;
+  private Context ctx;
+
+  @BeforeClass
+  public static void setUpOnce() throws Exception {
+    testUtility.startMiniCluster();
+  }
+
+  @AfterClass
+  public static void tearDownOnce() throws Exception {
+    testUtility.shutdownMiniCluster();
+  }
+
+  /**
+   * Most common context setup for unit tests using
+   * {@link SimpleHbaseEventSerializer}.
+   */
+  @Before
+  public void setUp() throws IOException {
+    conf = new Configuration(testUtility.getConfiguration());
+    ctx = new Context();
+    testUtility.createTable(tableName.getBytes(), columnFamily.getBytes());
+  }
+
+  @After
+  public void tearDown() throws IOException {
+    testUtility.deleteTable(tableName.getBytes());
+  }
+
+  /**
+   * Set up {@link Context} for use with {@link SimpleHbaseEventSerializer}.
+   */
+  private void initContextForSimpleHbaseEventSerializer() {
+    ctx = new Context();
+    ctx.put("table", tableName);
+    ctx.put("columnFamily", columnFamily);
+    ctx.put("serializer", SimpleHbaseEventSerializer.class.getName());
+    ctx.put("serializer.payloadColumn", plCol);
+    ctx.put("serializer.incrementColumn", inColumn);
+  }
+
+  /**
+   * Set up {@link Context} for use with {@link IncrementHBaseSerializer}.
+   */
+  private void initContextForIncrementHBaseSerializer() {
+    ctx = new Context();
+    ctx.put("table", tableName);
+    ctx.put("columnFamily", columnFamily);
+    ctx.put("serializer", IncrementHBaseSerializer.class.getName());
+  }
+
+  @Test
+  public void testOneEventWithDefaults() throws Exception {
+    //Create a context without setting increment column and payload Column
+    ctx = new Context();
+    ctx.put("table", tableName);
+    ctx.put("columnFamily", columnFamily);
+    ctx.put("serializer", SimpleHbaseEventSerializer.class.getName());
+
+    HBaseSink sink = new HBaseSink(conf);
+    Configurables.configure(sink, ctx);
+    Channel channel = new MemoryChannel();
+    Configurables.configure(channel, new Context());
+    sink.setChannel(channel);
+    sink.start();
+    Transaction tx = channel.getTransaction();
+    tx.begin();
+    Event e = EventBuilder.withBody(Bytes.toBytes(valBase));
+    channel.put(e);
+    tx.commit();
+    tx.close();
+
+    sink.process();
+    sink.stop();
+    HTable table = new HTable(conf, tableName);
+    byte[][] results = getResults(table, 1);
+    byte[] out = results[0];
+    Assert.assertArrayEquals(e.getBody(), out);
+    out = results[1];
+    Assert.assertArrayEquals(Longs.toByteArray(1), out);
+  }
+
+  @Test
+  public void testOneEvent() throws Exception {
+    initContextForSimpleHbaseEventSerializer();
+    HBaseSink sink = new HBaseSink(conf);
+    Configurables.configure(sink, ctx);
+    Channel channel = new MemoryChannel();
+    Configurables.configure(channel, new Context());
+    sink.setChannel(channel);
+    sink.start();
+    Transaction tx = channel.getTransaction();
+    tx.begin();
+    Event e = EventBuilder.withBody(
+        Bytes.toBytes(valBase));
+    channel.put(e);
+    tx.commit();
+    tx.close();
+
+    sink.process();
+    sink.stop();
+    HTable table = new HTable(conf, tableName);
+    byte[][] results = getResults(table, 1);
+    byte[] out = results[0];
+    Assert.assertArrayEquals(e.getBody(), out);
+    out = results[1];
+    Assert.assertArrayEquals(Longs.toByteArray(1), out);
+  }
+
+  @Test
+  public void testThreeEvents() throws Exception {
+    initContextForSimpleHbaseEventSerializer();
+    ctx.put("batchSize", "3");
+    HBaseSink sink = new HBaseSink(conf);
+    Configurables.configure(sink, ctx);
+    Channel channel = new MemoryChannel();
+    Configurables.configure(channel, new Context());
+    sink.setChannel(channel);
+    sink.start();
+    Transaction tx = channel.getTransaction();
+    tx.begin();
+    for (int i = 0; i < 3; i++) {
+      Event e = EventBuilder.withBody(Bytes.toBytes(valBase + "-" + i));
+      channel.put(e);
+    }
+    tx.commit();
+    tx.close();
+    sink.process();
+    sink.stop();
+    HTable table = new HTable(conf, tableName);
+    byte[][] results = getResults(table, 3);
+    byte[] out;
+    int found = 0;
+    for (int i = 0; i < 3; i++) {
+      for (int j = 0; j < 3; j++) {
+        if (Arrays.equals(results[j], Bytes.toBytes(valBase + "-" + i))) {
+          found++;
+          break;
+        }
+      }
+    }
+    Assert.assertEquals(3, found);
+    out = results[3];
+    Assert.assertArrayEquals(Longs.toByteArray(3), out);
+  }
+
+  @Test
+  public void testMultipleBatches() throws Exception {
+    initContextForSimpleHbaseEventSerializer();
+    ctx.put("batchSize", "2");
+    HBaseSink sink = new HBaseSink(conf);
+    Configurables.configure(sink, ctx);
+    //Reset the context to a higher batchSize
+    ctx.put("batchSize", "100");
+    Channel channel = new MemoryChannel();
+    Configurables.configure(channel, new Context());
+    sink.setChannel(channel);
+    sink.start();
+    Transaction tx = channel.getTransaction();
+    tx.begin();
+    for (int i = 0; i < 3; i++) {
+      Event e = EventBuilder.withBody(Bytes.toBytes(valBase + "-" + i));
+      channel.put(e);
+    }
+    tx.commit();
+    tx.close();
+    int count = 0;
+    while (sink.process() != Status.BACKOFF) {
+      count++;
+    }
+    sink.stop();
+    Assert.assertEquals(2, count);
+    HTable table = new HTable(conf, tableName);
+    byte[][] results = getResults(table, 3);
+    byte[] out;
+    int found = 0;
+    for (int i = 0; i < 3; i++) {
+      for (int j = 0; j < 3; j++) {
+        if (Arrays.equals(results[j], Bytes.toBytes(valBase + "-" + i))) {
+          found++;
+          break;
+        }
+      }
+    }
+    Assert.assertEquals(3, found);
+    out = results[3];
+    Assert.assertArrayEquals(Longs.toByteArray(3), out);
+  }
+
+  @Test(expected = FlumeException.class)
+  public void testMissingTable() throws Exception {
+    logger.info("Running testMissingTable()");
+    initContextForSimpleHbaseEventSerializer();
+
+    // setUp() will create the table, so we delete it.
+    logger.info("Deleting table {}", tableName);
+    testUtility.deleteTable(tableName.getBytes());
+
+    ctx.put("batchSize", "2");
+    HBaseSink sink = new HBaseSink(conf);
+    Configurables.configure(sink, ctx);
+    //Reset the context to a higher batchSize
+    ctx.put("batchSize", "100");
+    Channel channel = new MemoryChannel();
+    Configurables.configure(channel, new Context());
+    sink.setChannel(channel);
+
+    logger.info("Writing data into channel");
+    Transaction tx = channel.getTransaction();
+    tx.begin();
+    for (int i = 0; i < 3; i++) {
+      Event e = EventBuilder.withBody(Bytes.toBytes(valBase + "-" + i));
+      channel.put(e);
+    }
+    tx.commit();
+    tx.close();
+
+    logger.info("Starting sink and processing events");
+    try {
+      logger.info("Calling sink.start()");
+      sink.start(); // This method will throw.
+
+      // We never get here, but we log in case the behavior changes.
+      logger.error("Unexpected error: Calling sink.process()");
+      sink.process();
+      logger.error("Unexpected error: Calling sink.stop()");
+      sink.stop();
+    } finally {
+      // Re-create the table so tearDown() doesn't throw.
+      testUtility.createTable(tableName.getBytes(), columnFamily.getBytes());
+    }
+
+    // FIXME: The test should never get here, the below code doesn't run.
+    Assert.fail();
+
+    HTable table = new HTable(conf, tableName);
+    byte[][] results = getResults(table, 2);
+    byte[] out;
+    int found = 0;
+    for (int i = 0; i < 2; i++) {
+      for (int j = 0; j < 2; j++) {
+        if (Arrays.equals(results[j], Bytes.toBytes(valBase + "-" + i))) {
+          found++;
+          break;
+        }
+      }
+    }
+    Assert.assertEquals(2, found);
+    out = results[2];
+    Assert.assertArrayEquals(Longs.toByteArray(2), out);
+    sink.process();
+  }
+
+  // TODO: Move this test to a different class and run it stand-alone.
+
+  /**
+   * This test must run last - it shuts down the minicluster :D
+   *
+   * @throws Exception
+   */
+  @Ignore("For dev builds only: " +
+          "This test takes too long, and it has to be run after all other " +
+          "tests, since it shuts down the minicluster. " +
+          "Comment out all other tests " +
+          "and comment out this annotation to run this test.")
+  @Test(expected = EventDeliveryException.class)
+  public void testHBaseFailure() throws Exception {
+    initContextForSimpleHbaseEventSerializer();
+    ctx.put("batchSize", "2");
+    HBaseSink sink = new HBaseSink(conf);
+    Configurables.configure(sink, ctx);
+    //Reset the context to a higher batchSize
+    ctx.put("batchSize", "100");
+    Channel channel = new MemoryChannel();
+    Configurables.configure(channel, new Context());
+    sink.setChannel(channel);
+    sink.start();
+    Transaction tx = channel.getTransaction();
+    tx.begin();
+    for (int i = 0; i < 3; i++) {
+      Event e = EventBuilder.withBody(Bytes.toBytes(valBase + "-" + i));
+      channel.put(e);
+    }
+    tx.commit();
+    tx.close();
+    sink.process();
+    HTable table = new HTable(conf, tableName);
+    byte[][] results = getResults(table, 2);
+    byte[] out;
+    int found = 0;
+    for (int i = 0; i < 2; i++) {
+      for (int j = 0; j < 2; j++) {
+        if (Arrays.equals(results[j], Bytes.toBytes(valBase + "-" + i))) {
+          found++;
+          break;
+        }
+      }
+    }
+    Assert.assertEquals(2, found);
+    out = results[2];
+    Assert.assertArrayEquals(Longs.toByteArray(2), out);
+    testUtility.shutdownMiniCluster();
+    sink.process();
+    sink.stop();
+  }
+
+  /**
+   * Scans the given table for rows in the payload column and the increment
+   * column. Expensive, so tread lightly.
+   * Calling this function multiple times for the same result set is a bad
+   * idea. Cache the result set once it is returned by this function.
+   *
+   * @param table Table to scan
+   * @param numEvents Number of events inserted into the table
+   * @return Payload values in scan order, followed by the increment value
+   * @throws IOException If a scan fails
+   */
+  private byte[][] getResults(HTable table, int numEvents) throws IOException {
+    byte[][] results = new byte[numEvents + 1][];
+    Scan scan = new Scan();
+    scan.addColumn(columnFamily.getBytes(), plCol.getBytes());
+    scan.setStartRow(Bytes.toBytes("default"));
+    ResultScanner rs = table.getScanner(scan);
+    byte[] out = null;
+    int i = 0;
+    try {
+      for (Result r = rs.next(); r != null; r = rs.next()) {
+        out = r.getValue(columnFamily.getBytes(), plCol.getBytes());
+
+        if (i >= results.length - 1) {
+          rs.close();
+          throw new FlumeException("More results than expected in the table. " +
+                                   "Expected = " + numEvents + ". Found = " + i);
+        }
+        results[i++] = out;
+        System.out.println(Bytes.toStringBinary(out));
+      }
+    } finally {
+      rs.close();
+    }
+
+    Assert.assertEquals(results.length - 1, i);
+    scan = new Scan();
+    scan.addColumn(columnFamily.getBytes(), inColumn.getBytes());
+    scan.setStartRow(Bytes.toBytes("incRow"));
+    rs = table.getScanner(scan);
+    out = null;
+    try {
+      for (Result r = rs.next(); r != null; r = rs.next()) {
+        out = r.getValue(columnFamily.getBytes(), inColumn.getBytes());
+        results[i++] = out;
+        System.out.println(Bytes.toStringBinary(out));
+      }
+    } finally {
+      rs.close();
+    }
+    return results;
+  }
+
+  @Test
+  public void testTransactionStateOnChannelException() throws Exception {
+    initContextForSimpleHbaseEventSerializer();
+    ctx.put("batchSize", "1");
+
+    HBaseSink sink = new HBaseSink(conf);
+    Configurables.configure(sink, ctx);
+    // Spy on the channel so that take() can be stubbed to throw below.
+    Channel channel = spy(new MemoryChannel());
+    Configurables.configure(channel, new Context());
+    sink.setChannel(channel);
+    sink.start();
+    Transaction tx = channel.getTransaction();
+    tx.begin();
+    Event e = EventBuilder.withBody(Bytes.toBytes(valBase + "-" + 0));
+    channel.put(e);
+    tx.commit();
+    tx.close();
+    doThrow(new ChannelException("Mock Exception")).when(channel).take();
+    try {
+      sink.process();
+      Assert.fail("take() method should throw exception");
+    } catch (ChannelException ex) {
+      Assert.assertEquals("Mock Exception", ex.getMessage());
+    }
+    doReturn(e).when(channel).take();
+    sink.process();
+    sink.stop();
+    HTable table = new HTable(conf, tableName);
+    byte[][] results = getResults(table, 1);
+    byte[] out = results[0];
+    Assert.assertArrayEquals(e.getBody(), out);
+    out = results[1];
+    Assert.assertArrayEquals(Longs.toByteArray(1), out);
+  }
+
+  @Test
+  public void testTransactionStateOnSerializationException() throws Exception {
+    initContextForSimpleHbaseEventSerializer();
+    ctx.put("batchSize", "1");
+    ctx.put(HBaseSinkConfigurationConstants.CONFIG_SERIALIZER,
+            "org.apache.flume.sink.hbase.MockSimpleHbaseEventSerializer");
+
+    HBaseSink sink = new HBaseSink(conf);
+    Configurables.configure(sink, ctx);
+    // Reset the context to a higher batchSize
+    ctx.put("batchSize", "100");
+    Channel channel = new MemoryChannel();
+    Configurables.configure(channel, new Context());
+    sink.setChannel(channel);
+    sink.start();
+    Transaction tx = channel.getTransaction();
+    tx.begin();
+    Event e = EventBuilder.withBody(Bytes.toBytes(valBase + "-" + 0));
+    channel.put(e);
+    tx.commit();
+    tx.close();
+    try {
+      MockSimpleHbaseEventSerializer.throwException = true;
+      sink.process();
+      Assert.fail("FlumeException expected from serializer");
+    } catch (FlumeException ex) {
+      Assert.assertEquals("Exception for testing", ex.getMessage());
+    }
+    MockSimpleHbaseEventSerializer.throwException = false;
+    sink.process();
+    sink.stop();
+    HTable table = new HTable(conf, tableName);
+    byte[][] results = getResults(table, 1);
+    byte[] out = results[0];
+    Assert.assertArrayEquals(e.getBody(), out);
+    out = results[1];
+    Assert.assertArrayEquals(Longs.toByteArray(1), out);
+  }
+
+  @Test
+  public void testWithoutConfigurationObject() throws Exception {
+    initContextForSimpleHbaseEventSerializer();
+    Context tmpContext = new Context(ctx.getParameters());
+    tmpContext.put("batchSize", "2");
+    tmpContext.put(HBaseSinkConfigurationConstants.ZK_QUORUM,
+                   ZKConfig.getZKQuorumServersString(conf));
+    System.out.print(tmpContext.getString(HBaseSinkConfigurationConstants.ZK_QUORUM));
+    tmpContext.put(HBaseSinkConfigurationConstants.ZK_ZNODE_PARENT,
+                   conf.get(HConstants.ZOOKEEPER_ZNODE_PARENT,
+                            HConstants.DEFAULT_ZOOKEEPER_ZNODE_PARENT));
+
+    HBaseSink sink = new HBaseSink();
+    Configurables.configure(sink, tmpContext);
+    Channel channel = new MemoryChannel();
+    Configurables.configure(channel, ctx);
+    sink.setChannel(channel);
+    sink.start();
+    Transaction tx = channel.getTransaction();
+    tx.begin();
+    for (int i = 0; i < 3; i++) {
+      Event e = EventBuilder.withBody(Bytes.toBytes(valBase + "-" + i));
+      channel.put(e);
+    }
+    tx.commit();
+    tx.close();
+    Status status = Status.READY;
+    while (status != Status.BACKOFF) {
+      status = sink.process();
+    }
+    sink.stop();
+    HTable table = new HTable(conf, tableName);
+    byte[][] results = getResults(table, 3);
+    byte[] out;
+    int found = 0;
+    for (int i = 0; i < 3; i++) {
+      for (int j = 0; j < 3; j++) {
+        if (Arrays.equals(results[j], Bytes.toBytes(valBase + "-" + i))) {
+          found++;
+          break;
+        }
+      }
+    }
+    Assert.assertEquals(3, found);
+    out = results[3];
+    Assert.assertArrayEquals(Longs.toByteArray(3), out);
+  }
+
+  @Test
+  public void testZKQuorum() throws Exception {
+    initContextForSimpleHbaseEventSerializer();
+    Context tmpContext = new Context(ctx.getParameters());
+    String zkQuorum = "zk1.flume.apache.org:3342, zk2.flume.apache.org:3342, " +
+                      "zk3.flume.apache.org:3342";
+    tmpContext.put("batchSize", "2");
+    tmpContext.put(HBaseSinkConfigurationConstants.ZK_QUORUM, zkQuorum);
+    tmpContext.put(HBaseSinkConfigurationConstants.ZK_ZNODE_PARENT,
+                   conf.get(HConstants.ZOOKEEPER_ZNODE_PARENT,
+                            HConstants.DEFAULT_ZOOKEEPER_ZNODE_PARENT));
+    HBaseSink sink = new HBaseSink();
+    Configurables.configure(sink, tmpContext);
+    Assert.assertEquals("zk1.flume.apache.org,zk2.flume.apache.org," +
+                        "zk3.flume.apache.org", sink.getConfig().get(HConstants.ZOOKEEPER_QUORUM));
+    Assert.assertEquals(String.valueOf(3342),
+                        sink.getConfig().get(HConstants.ZOOKEEPER_CLIENT_PORT));
+  }
+
+  @Test(expected = FlumeException.class)
+  public void testZKQuorumIncorrectPorts() throws Exception {
+    initContextForSimpleHbaseEventSerializer();
+    Context tmpContext = new Context(ctx.getParameters());
+
+    String zkQuorum = "zk1.flume.apache.org:3345, zk2.flume.apache.org:3342, " +
+                      "zk3.flume.apache.org:3342";
+    tmpContext.put("batchSize", "2");
+    tmpContext.put(HBaseSinkConfigurationConstants.ZK_QUORUM, zkQuorum);
+    tmpContext.put(HBaseSinkConfigurationConstants.ZK_ZNODE_PARENT,
+                   conf.get(HConstants.ZOOKEEPER_ZNODE_PARENT,
+                            HConstants.DEFAULT_ZOOKEEPER_ZNODE_PARENT));
+    HBaseSink sink = new HBaseSink();
+    Configurables.configure(sink, tmpContext);
+    Assert.fail();
+  }
+
+  @Test
+  public void testCoalesce() throws EventDeliveryException {
+    initContextForIncrementHBaseSerializer();
+    ctx.put("batchSize", "100");
+    ctx.put(HBaseSinkConfigurationConstants.CONFIG_COALESCE_INCREMENTS,
+        String.valueOf(true));
+
+    final Map<String, Long> expectedCounts = Maps.newHashMap();
+    expectedCounts.put("r1:c1", 10L);
+    expectedCounts.put("r1:c2", 20L);
+    expectedCounts.put("r2:c1", 7L);
+    expectedCounts.put("r2:c3", 63L);
+    HBaseSink.DebugIncrementsCallback cb = new CoalesceValidator(expectedCounts);
+
+    HBaseSink sink = new HBaseSink(testUtility.getConfiguration(), cb);
+    Configurables.configure(sink, ctx);
+    Channel channel = createAndConfigureMemoryChannel(sink);
+
+    List<Event> events = Lists.newLinkedList();
+    generateEvents(events, expectedCounts);
+    putEvents(channel, events);
+
+    sink.start();
+    sink.process(); // Calls CoalesceValidator instance.
+    sink.stop();
+  }
+
+  @Test(expected = AssertionError.class)
+  public void negativeTestCoalesce() throws EventDeliveryException {
+    initContextForIncrementHBaseSerializer();
+    ctx.put("batchSize", "10");
+
+    final Map<String, Long> expectedCounts = Maps.newHashMap();
+    expectedCounts.put("r1:c1", 10L);
+    HBaseSink.DebugIncrementsCallback cb = new CoalesceValidator(expectedCounts);
+
+    HBaseSink sink = new HBaseSink(testUtility.getConfiguration(), cb);
+    Configurables.configure(sink, ctx);
+    Channel channel = createAndConfigureMemoryChannel(sink);
+
+    List<Event> events = Lists.newLinkedList();
+    generateEvents(events, expectedCounts);
+    putEvents(channel, events);
+
+    sink.start();
+    sink.process(); // Calls CoalesceValidator instance.
+    sink.stop();
+  }
+
+  @Test
+  public void testBatchAware() throws EventDeliveryException {
+    logger.info("Running testBatchAware()");
+    initContextForIncrementHBaseSerializer();
+    HBaseSink sink = new HBaseSink(testUtility.getConfiguration());
+    Configurables.configure(sink, ctx);
+    Channel channel = createAndConfigureMemoryChannel(sink);
+
+    sink.start();
+    int batchCount = 3;
+    for (int i = 0; i < batchCount; i++) {
+      sink.process();
+    }
+    sink.stop();
+    Assert.assertEquals(batchCount,
+        ((IncrementHBaseSerializer) sink.getSerializer()).getNumBatchesStarted());
+  }
+
+  /**
+   * For testing that the rows coalesced, serialized by
+   * {@link IncrementHBaseSerializer}, are of the expected batch size.
+   */
+  private static class CoalesceValidator
+      implements HBaseSink.DebugIncrementsCallback {
+
+    private final Map<String,Long> expectedCounts;
+    private final Method refGetFamilyMap;
+
+    public CoalesceValidator(Map<String, Long> expectedCounts) {
+      this.expectedCounts = expectedCounts;
+      this.refGetFamilyMap = HBaseSink.reflectLookupGetFamilyMap();
+    }
+
+    @Override
+    @SuppressWarnings("unchecked")
+    public void onAfterCoalesce(Iterable<Increment> increments) {
+      for (Increment inc : increments) {
+        byte[] row = inc.getRow();
+        Map<byte[], NavigableMap<byte[], Long>> families = null;
+        try {
+          families = (Map<byte[], NavigableMap<byte[], Long>>)
+              refGetFamilyMap.invoke(inc);
+        } catch (Exception e) {
+          Throwables.propagate(e);
+        }
+        for (byte[] family : families.keySet()) {
+          NavigableMap<byte[], Long> qualifiers = families.get(family);
+          for (Map.Entry<byte[], Long> entry : qualifiers.entrySet()) {
+            byte[] qualifier = entry.getKey();
+            Long count = entry.getValue();
+            StringBuilder b = new StringBuilder(20);
+            b.append(new String(row, Charsets.UTF_8));
+            b.append(':');
+            b.append(new String(qualifier, Charsets.UTF_8));
+            String key = b.toString();
+            Assert.assertEquals("Expected counts don't match observed for " + key,
+                expectedCounts.get(key), count);
+          }
+        }
+      }
+    }
+  }
+
+  /**
+   * Add number of Events corresponding to counts to the events list.
+   * @param events Destination list.
+   * @param counts How many events to generate for each row:qualifier pair.
+   */
+  private void generateEvents(List<Event> events, Map<String, Long> counts) {
+    for (String key : counts.keySet()) {
+      long count = counts.get(key);
+      for (long i = 0; i < count; i++) {
+        events.add(EventBuilder.withBody(key, Charsets.UTF_8));
+      }
+    }
+  }
+
+  private Channel createAndConfigureMemoryChannel(HBaseSink sink) {
+    Channel channel = new MemoryChannel();
+    Context channelCtx = new Context();
+    channelCtx.put("capacity", String.valueOf(1000L));
+    channelCtx.put("transactionCapacity", String.valueOf(1000L));
+    Configurables.configure(channel, channelCtx);
+    sink.setChannel(channel);
+    channel.start();
+    return channel;
+  }
+
+  private void putEvents(Channel channel, Iterable<Event> events) {
+    Transaction tx = channel.getTransaction();
+    tx.begin();
+    for (Event event : events) {
+      channel.put(event);
+    }
+    tx.commit();
+    tx.close();
+  }
+
+}
diff --git a/code/flume-ng-sinks/flume-ng-hbase-sink/src/test/java/org/apache/flume/sink/hbase/TestHBaseSinkCreation.java b/code/flume-ng-sinks/flume-ng-hbase-sink/src/test/java/org/apache/flume/sink/hbase/TestHBaseSinkCreation.java
new file mode 100644
index 0000000..115bc62
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-hbase-sink/src/test/java/org/apache/flume/sink/hbase/TestHBaseSinkCreation.java
@@ -0,0 +1,50 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.flume.sink.hbase;
+
+import org.apache.flume.FlumeException;
+import org.apache.flume.Sink;
+import org.apache.flume.SinkFactory;
+import org.apache.flume.sink.DefaultSinkFactory;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;
+
+public class TestHBaseSinkCreation {
+
+  private SinkFactory sinkFactory;
+
+  @Before
+  public void setUp() {
+    sinkFactory = new DefaultSinkFactory();
+  }
+
+  private void verifySinkCreation(String name, String type,
+      Class<?> typeClass) throws FlumeException {
+    Sink sink = sinkFactory.create(name, type);
+    Assert.assertNotNull(sink);
+    Assert.assertTrue(typeClass.isInstance(sink));
+  }
+
+  @Test
+  public void testSinkCreation() {
+    verifySinkCreation("hbase-sink", "hbase", HBaseSink.class);
+    verifySinkCreation("asynchbase-sink", "asynchbase", AsyncHBaseSink.class);
+  }
+}
diff --git a/code/flume-ng-sinks/flume-ng-hbase-sink/src/test/java/org/apache/flume/sink/hbase/TestRegexHbaseEventSerializer.java b/code/flume-ng-sinks/flume-ng-hbase-sink/src/test/java/org/apache/flume/sink/hbase/TestRegexHbaseEventSerializer.java
new file mode 100644
index 0000000..24bcf37
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-hbase-sink/src/test/java/org/apache/flume/sink/hbase/TestRegexHbaseEventSerializer.java
@@ -0,0 +1,232 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.flume.sink.hbase;
+
+import com.google.common.collect.Maps;
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.flume.event.EventBuilder;
+import org.apache.hadoop.hbase.KeyValue;
+import org.apache.hadoop.hbase.client.Increment;
+import org.apache.hadoop.hbase.client.Put;
+import org.apache.hadoop.hbase.client.Row;
+import org.apache.hadoop.hbase.util.Bytes;
+import org.junit.Test;
+
+import java.nio.charset.Charset;
+import java.util.Calendar;
+import java.util.List;
+import java.util.Map;
+
+import static org.junit.Assert.assertArrayEquals;
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertTrue;
+import static org.mockito.Mockito.mock;
+import static org.mockito.Mockito.when;
+
+public class TestRegexHbaseEventSerializer {
+
+  @Test
+  /** Ensure that when no config is specified, a catch-all regex is used
+   *  with default column name. */
+  public void testDefaultBehavior() throws Exception {
+    RegexHbaseEventSerializer s = new RegexHbaseEventSerializer();
+    Context context = new Context();
+    s.configure(context);
+    String logMsg = "The sky is falling!";
+    Event e = EventBuilder.withBody(Bytes.toBytes(logMsg));
+    s.initialize(e, "CF".getBytes());
+    List<Row> actions = s.getActions();
+    assertEquals(1, actions.size());
+    assertTrue(actions.get(0) instanceof Put);
+    Put put = (Put) actions.get(0);
+    
+    assertTrue(put.getFamilyMap().containsKey(s.cf));
+    List<KeyValue> kvPairs = put.getFamilyMap().get(s.cf);
+    assertEquals(1, kvPairs.size());
+    
+    Map<String, String> resultMap = Maps.newHashMap();
+    for (KeyValue kv : kvPairs) {
+      resultMap.put(new String(kv.getQualifier()), new String(kv.getValue()));
+    }
+    
+    assertTrue(resultMap.containsKey(
+        RegexHbaseEventSerializer.COLUMN_NAME_DEFAULT));
+    assertEquals("The sky is falling!",
+        resultMap.get(RegexHbaseEventSerializer.COLUMN_NAME_DEFAULT));
+  }
+  @Test
+  public void testRowIndexKey() throws Exception {
+    RegexHbaseEventSerializer s = new RegexHbaseEventSerializer();
+    Context context = new Context();
+    context.put(RegexHbaseEventSerializer.REGEX_CONFIG, "^([^\t]+)\t([^\t]+)\t([^\t]+)$");
+    context.put(RegexHbaseEventSerializer.COL_NAME_CONFIG, "col1,col2,ROW_KEY");
+    context.put("rowKeyIndex", "2");
+    s.configure(context);
+
+    String body = "val1\tval2\trow1";
+    Event e = EventBuilder.withBody(Bytes.toBytes(body));
+    s.initialize(e, "CF".getBytes());
+    List<Row> actions = s.getActions();
+
+    Put put = (Put)actions.get(0);
+
+    List<KeyValue> kvPairs = put.getFamilyMap().get(s.cf);
+    assertEquals(2, kvPairs.size());
+
+    Map<String, String> resultMap = Maps.newHashMap();
+    for (KeyValue kv : kvPairs) {
+      resultMap.put(new String(kv.getQualifier()), new String(kv.getValue()));
+    }
+    assertEquals("val1", resultMap.get("col1"));
+    assertEquals("val2", resultMap.get("col2"));
+    assertEquals("row1", Bytes.toString(put.getRow()));
+  }
+
+  @Test
+  /** Test a common case where regex is used to parse apache log format. */
+  public void testApacheRegex() throws Exception {
+    RegexHbaseEventSerializer s = new RegexHbaseEventSerializer();
+    Context context = new Context();
+    context.put(RegexHbaseEventSerializer.REGEX_CONFIG,
+        "([^ ]*) ([^ ]*) ([^ ]*) (-|\\[[^\\]]*\\]) \"([^ ]+) ([^ ]+)" +
+        " ([^\"]+)\" (-|[0-9]*) (-|[0-9]*)(?: ([^ \"]*|\"[^\"]*\")" +
+        " ([^ \"]*|\"[^\"]*\"))?");
+    context.put(RegexHbaseEventSerializer.COL_NAME_CONFIG,
+        "host,identity,user,time,method,request,protocol,status,size," +
+        "referer,agent");
+    s.configure(context);
+    String logMsg = "33.22.11.00 - - [20/May/2011:07:01:19 +0000] " +
+        "\"GET /wp-admin/css/install.css HTTP/1.0\" 200 813 " +
+        "\"http://www.cloudera.com/wp-admin/install.php\" \"Mozilla/5.0 (comp" +
+        "atible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)\"";
+    
+    Event e = EventBuilder.withBody(Bytes.toBytes(logMsg));
+    s.initialize(e, "CF".getBytes());
+    List<Row> actions = s.getActions();
+    assertEquals(1, s.getActions().size());
+    assertTrue(actions.get(0) instanceof Put);
+    
+    Put put = (Put) actions.get(0);
+    assertTrue(put.getFamilyMap().containsKey(s.cf));
+    List<KeyValue> kvPairs = put.getFamilyMap().get(s.cf);
+    assertEquals(11, kvPairs.size());
+    
+    Map<String, String> resultMap = Maps.newHashMap();
+    for (KeyValue kv : kvPairs) {
+      resultMap.put(new String(kv.getQualifier()), new String(kv.getValue()));
+    }
+    
+    assertEquals("33.22.11.00", resultMap.get("host"));
+    assertEquals("-", resultMap.get("identity"));
+    assertEquals("-", resultMap.get("user"));
+    assertEquals("[20/May/2011:07:01:19 +0000]", resultMap.get("time"));
+    assertEquals("GET", resultMap.get("method"));
+    assertEquals("/wp-admin/css/install.css", resultMap.get("request"));
+    assertEquals("HTTP/1.0", resultMap.get("protocol"));
+    assertEquals("200", resultMap.get("status"));
+    assertEquals("813", resultMap.get("size"));
+    assertEquals("\"http://www.cloudera.com/wp-admin/install.php\"", 
+        resultMap.get("referer"));
+    assertEquals("\"Mozilla/5.0 (compatible; Yahoo! Slurp; " +
+        "http://help.yahoo.com/help/us/ysearch/slurp)\"", 
+        resultMap.get("agent"));
+    
+    List<Increment> increments = s.getIncrements();
+    assertEquals(0, increments.size());
+  }
+  
+  @Test
+  public void testRowKeyGeneration() {
+    Context context = new Context();
+    RegexHbaseEventSerializer s1 = new RegexHbaseEventSerializer();
+    s1.configure(context);
+    RegexHbaseEventSerializer s2 = new RegexHbaseEventSerializer();
+    s2.configure(context);
+    
+    // Reset shared nonce to zero
+    RegexHbaseEventSerializer.nonce.set(0);
+    String randomString = RegexHbaseEventSerializer.randomKey;
+    
+    Event e1 = EventBuilder.withBody(Bytes.toBytes("body"));
+    Event e2 = EventBuilder.withBody(Bytes.toBytes("body"));
+    Event e3 = EventBuilder.withBody(Bytes.toBytes("body"));
+
+    Calendar cal = mock(Calendar.class);
+    when(cal.getTimeInMillis()).thenReturn(1L);
+    
+    s1.initialize(e1, "CF".getBytes());
+    String rk1 = new String(s1.getRowKey(cal));
+    assertEquals("1-" + randomString + "-0", rk1);
+    
+    when(cal.getTimeInMillis()).thenReturn(10L);
+    s1.initialize(e2, "CF".getBytes());
+    String rk2 = new String(s1.getRowKey(cal));
+    assertEquals("10-" + randomString + "-1", rk2);
+   
+    when(cal.getTimeInMillis()).thenReturn(100L);
+    s2.initialize(e3, "CF".getBytes());
+    String rk3 = new String(s2.getRowKey(cal));
+    assertEquals("100-" + randomString + "-2", rk3);
+    
+  }
+
+  @Test
+  /** Test depositing of the header information. */
+  public void testDepositHeaders() throws Exception {
+    Charset charset = Charset.forName("KOI8-R");
+    RegexHbaseEventSerializer s = new RegexHbaseEventSerializer();
+    Context context = new Context();
+    context.put(RegexHbaseEventSerializer.DEPOSIT_HEADERS_CONFIG,
+        "true");
+    context.put(RegexHbaseEventSerializer.CHARSET_CONFIG,
+               charset.toString());
+    s.configure(context);
+
+    String body = "body";
+    Map<String, String> headers = Maps.newHashMap();
+    headers.put("header1", "value1");
+    headers.put("заголовок2", "значение2");
+
+    Event e = EventBuilder.withBody(Bytes.toBytes(body), headers);
+    s.initialize(e, "CF".getBytes());
+    List<Row> actions = s.getActions();
+    assertEquals(1, actions.size());
+    assertTrue(actions.get(0) instanceof Put);
+
+    Put put = (Put) actions.get(0);
+    assertTrue(put.getFamilyMap().containsKey(s.cf));
+    List<KeyValue> kvPairs = put.getFamilyMap().get(s.cf);
+    assertEquals(3, kvPairs.size());
+
+    Map<String, byte[]> resultMap = Maps.newHashMap();
+    for (KeyValue kv : kvPairs) {
+      resultMap.put(new String(kv.getQualifier(), charset), kv.getValue());
+    }
+
+    assertEquals(body,
+                 new String(resultMap.get(RegexHbaseEventSerializer.COLUMN_NAME_DEFAULT), charset));
+    assertEquals("value1", new String(resultMap.get("header1"), charset));
+    assertArrayEquals("значение2".getBytes(charset), resultMap.get("заголовок2"));
+    assertEquals("значение2".length(), resultMap.get("заголовок2").length);
+
+    List<Increment> increments = s.getIncrements();
+    assertEquals(0, increments.size());
+  }
+}
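Note on `testRowKeyGeneration` above: the assertions imply a row key of the form `<timestamp>-<randomKey>-<nonce>`, with the nonce shared by all serializer instances (`s2` continues at 2 after `s1` used 0 and 1). The sketch below only illustrates that scheme and is not the Flume implementation. The class `RowKeySketch`, the fixed `RANDOM_KEY` value, and the choice of `AtomicInteger` are assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the serializer's row-key logic; not the Flume class.
final class RowKeySketch {
  // Shared by every caller, like the static nonce the test resets to zero.
  static final AtomicInteger NONCE = new AtomicInteger(0);

  // Assumption: a fixed value for the example; the test reads the real one
  // from RegexHbaseEventSerializer.randomKey.
  static final String RANDOM_KEY = "abc";

  // Builds "<timestamp>-<randomKey>-<nonce>" and advances the shared nonce.
  static byte[] rowKey(long timeMillis) {
    String key = timeMillis + "-" + RANDOM_KEY + "-" + NONCE.getAndIncrement();
    return key.getBytes(StandardCharsets.UTF_8);
  }
}
```

Because the nonce is static, keys stay unique across serializer instances in one JVM even when two events carry the same timestamp.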
diff --git a/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/AsyncHBaseSink$1.class b/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/AsyncHBaseSink$1.class
new file mode 100644
index 0000000..16cbbe1
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/AsyncHBaseSink$1.class differ
diff --git a/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/AsyncHBaseSink$2.class b/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/AsyncHBaseSink$2.class
new file mode 100644
index 0000000..1d6c8ed
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/AsyncHBaseSink$2.class differ
diff --git a/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/AsyncHBaseSink$3.class b/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/AsyncHBaseSink$3.class
new file mode 100644
index 0000000..a4bb8d2
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/AsyncHBaseSink$3.class differ
diff --git a/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/AsyncHBaseSink$4.class b/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/AsyncHBaseSink$4.class
new file mode 100644
index 0000000..b0e527e
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/AsyncHBaseSink$4.class differ
diff --git a/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/AsyncHBaseSink$CellIdentifier.class b/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/AsyncHBaseSink$CellIdentifier.class
new file mode 100644
index 0000000..e757011
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/AsyncHBaseSink$CellIdentifier.class differ
diff --git a/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/AsyncHBaseSink$FailureCallback.class b/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/AsyncHBaseSink$FailureCallback.class
new file mode 100644
index 0000000..85a5a6e
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/AsyncHBaseSink$FailureCallback.class differ
diff --git a/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/AsyncHBaseSink$SuccessCallback.class b/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/AsyncHBaseSink$SuccessCallback.class
new file mode 100644
index 0000000..ca7e8d5
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/AsyncHBaseSink$SuccessCallback.class differ
diff --git a/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/AsyncHBaseSink.class b/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/AsyncHBaseSink.class
new file mode 100644
index 0000000..8c3b481
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/AsyncHBaseSink.class differ
diff --git a/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/AsyncHbaseEventSerializer.class b/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/AsyncHbaseEventSerializer.class
new file mode 100644
index 0000000..77c22c3
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/AsyncHbaseEventSerializer.class differ
diff --git a/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/BatchAware.class b/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/BatchAware.class
new file mode 100644
index 0000000..c2d18e7
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/BatchAware.class differ
diff --git a/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/HBaseSink$1.class b/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/HBaseSink$1.class
new file mode 100644
index 0000000..25a444f
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/HBaseSink$1.class differ
diff --git a/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/HBaseSink$2.class b/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/HBaseSink$2.class
new file mode 100644
index 0000000..4d1a284
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/HBaseSink$2.class differ
diff --git a/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/HBaseSink$3.class b/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/HBaseSink$3.class
new file mode 100644
index 0000000..ddf1f02
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/HBaseSink$3.class differ
diff --git a/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/HBaseSink$4.class b/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/HBaseSink$4.class
new file mode 100644
index 0000000..57677c2
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/HBaseSink$4.class differ
diff --git a/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/HBaseSink$DebugIncrementsCallback.class b/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/HBaseSink$DebugIncrementsCallback.class
new file mode 100644
index 0000000..a93f223
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/HBaseSink$DebugIncrementsCallback.class differ
diff --git a/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/HBaseSink.class b/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/HBaseSink.class
new file mode 100644
index 0000000..9107bc8
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/HBaseSink.class differ
diff --git a/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/HBaseSinkConfigurationConstants.class b/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/HBaseSinkConfigurationConstants.class
new file mode 100644
index 0000000..45a7d07
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/HBaseSinkConfigurationConstants.class differ
diff --git a/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/HbaseEventSerializer.class b/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/HbaseEventSerializer.class
new file mode 100644
index 0000000..8bbb189
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/HbaseEventSerializer.class differ
diff --git a/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/KfkAsyncHbaseEventSerializer.class b/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/KfkAsyncHbaseEventSerializer.class
new file mode 100644
index 0000000..5bdd7ce
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/KfkAsyncHbaseEventSerializer.class differ
diff --git a/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/RegexHbaseEventSerializer.class b/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/RegexHbaseEventSerializer.class
new file mode 100644
index 0000000..9575d0a
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/RegexHbaseEventSerializer.class differ
diff --git a/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/SimpleAsyncHbaseEventSerializer$1.class b/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/SimpleAsyncHbaseEventSerializer$1.class
new file mode 100644
index 0000000..7b9558e
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/SimpleAsyncHbaseEventSerializer$1.class differ
diff --git a/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/SimpleAsyncHbaseEventSerializer.class b/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/SimpleAsyncHbaseEventSerializer.class
new file mode 100644
index 0000000..bb34c15
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/SimpleAsyncHbaseEventSerializer.class differ
diff --git a/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/SimpleHbaseEventSerializer$KeyType.class b/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/SimpleHbaseEventSerializer$KeyType.class
new file mode 100644
index 0000000..7576159
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/SimpleHbaseEventSerializer$KeyType.class differ
diff --git a/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/SimpleHbaseEventSerializer.class b/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/SimpleHbaseEventSerializer.class
new file mode 100644
index 0000000..e375eca
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/SimpleHbaseEventSerializer.class differ
diff --git a/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/SimpleRowKeyGenerator.class b/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/SimpleRowKeyGenerator.class
new file mode 100644
index 0000000..037185c
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-hbase-sink/target/classes/org/apache/flume/sink/hbase/SimpleRowKeyGenerator.class differ
diff --git a/code/flume-ng-sinks/flume-ng-hbase-sink/target/test-classes/org/apache/flume/sink/hbase/IncrementAsyncHBaseSerializer.class b/code/flume-ng-sinks/flume-ng-hbase-sink/target/test-classes/org/apache/flume/sink/hbase/IncrementAsyncHBaseSerializer.class
new file mode 100644
index 0000000..d8a5255
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-hbase-sink/target/test-classes/org/apache/flume/sink/hbase/IncrementAsyncHBaseSerializer.class differ
diff --git a/code/flume-ng-sinks/flume-ng-hbase-sink/target/test-classes/org/apache/flume/sink/hbase/IncrementHBaseSerializer.class b/code/flume-ng-sinks/flume-ng-hbase-sink/target/test-classes/org/apache/flume/sink/hbase/IncrementHBaseSerializer.class
new file mode 100644
index 0000000..39cde0f
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-hbase-sink/target/test-classes/org/apache/flume/sink/hbase/IncrementHBaseSerializer.class differ
diff --git a/code/flume-ng-sinks/flume-ng-hbase-sink/target/test-classes/org/apache/flume/sink/hbase/MockSimpleHbaseEventSerializer.class b/code/flume-ng-sinks/flume-ng-hbase-sink/target/test-classes/org/apache/flume/sink/hbase/MockSimpleHbaseEventSerializer.class
new file mode 100644
index 0000000..4ced24c
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-hbase-sink/target/test-classes/org/apache/flume/sink/hbase/MockSimpleHbaseEventSerializer.class differ
diff --git a/code/flume-ng-sinks/flume-ng-hbase-sink/target/test-classes/org/apache/flume/sink/hbase/TestAsyncHBaseSink.class b/code/flume-ng-sinks/flume-ng-hbase-sink/target/test-classes/org/apache/flume/sink/hbase/TestAsyncHBaseSink.class
new file mode 100644
index 0000000..dc6a15f
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-hbase-sink/target/test-classes/org/apache/flume/sink/hbase/TestAsyncHBaseSink.class differ
diff --git a/code/flume-ng-sinks/flume-ng-hbase-sink/target/test-classes/org/apache/flume/sink/hbase/TestHBaseSink$CoalesceValidator.class b/code/flume-ng-sinks/flume-ng-hbase-sink/target/test-classes/org/apache/flume/sink/hbase/TestHBaseSink$CoalesceValidator.class
new file mode 100644
index 0000000..db7933e
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-hbase-sink/target/test-classes/org/apache/flume/sink/hbase/TestHBaseSink$CoalesceValidator.class differ
diff --git a/code/flume-ng-sinks/flume-ng-hbase-sink/target/test-classes/org/apache/flume/sink/hbase/TestHBaseSink.class b/code/flume-ng-sinks/flume-ng-hbase-sink/target/test-classes/org/apache/flume/sink/hbase/TestHBaseSink.class
new file mode 100644
index 0000000..4689132
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-hbase-sink/target/test-classes/org/apache/flume/sink/hbase/TestHBaseSink.class differ
diff --git a/code/flume-ng-sinks/flume-ng-hbase-sink/target/test-classes/org/apache/flume/sink/hbase/TestHBaseSinkCreation.class b/code/flume-ng-sinks/flume-ng-hbase-sink/target/test-classes/org/apache/flume/sink/hbase/TestHBaseSinkCreation.class
new file mode 100644
index 0000000..703be15
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-hbase-sink/target/test-classes/org/apache/flume/sink/hbase/TestHBaseSinkCreation.class differ
diff --git a/code/flume-ng-sinks/flume-ng-hbase-sink/target/test-classes/org/apache/flume/sink/hbase/TestRegexHbaseEventSerializer.class b/code/flume-ng-sinks/flume-ng-hbase-sink/target/test-classes/org/apache/flume/sink/hbase/TestRegexHbaseEventSerializer.class
new file mode 100644
index 0000000..cfd3f0b
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-hbase-sink/target/test-classes/org/apache/flume/sink/hbase/TestRegexHbaseEventSerializer.class differ
diff --git a/code/flume-ng-sinks/flume-ng-kafka-sink/pom.xml b/code/flume-ng-sinks/flume-ng-kafka-sink/pom.xml
new file mode 100644
index 0000000..8ad229e
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-kafka-sink/pom.xml
@@ -0,0 +1,91 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!-- Licensed to the Apache Software Foundation (ASF) under one or more contributor
+  license agreements. See the NOTICE file distributed with this work for additional
+  information regarding copyright ownership. The ASF licenses this file to
+  You under the Apache License, Version 2.0 (the "License"); you may not use
+  this file except in compliance with the License. You may obtain a copy of
+  the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required
+  by applicable law or agreed to in writing, software distributed under the
+  License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS
+  OF ANY KIND, either express or implied. See the License for the specific
+  language governing permissions and limitations under the License. -->
+<project xmlns="http://maven.apache.org/POM/4.0.0"
+         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+  <modelVersion>4.0.0</modelVersion>
+  <parent>
+    <artifactId>flume-ng-sinks</artifactId>
+    <groupId>org.apache.flume</groupId>
+    <version>1.7.0</version>
+  </parent>
+  <groupId>org.apache.flume.flume-ng-sinks</groupId>
+  <artifactId>flume-ng-kafka-sink</artifactId>
+  <name>Flume Kafka Sink</name>
+
+  <build>
+    <plugins>
+      <plugin>
+        <groupId>org.apache.rat</groupId>
+        <artifactId>apache-rat-plugin</artifactId>
+      </plugin>
+      <plugin>
+        <groupId>org.apache.maven.plugins</groupId>
+        <artifactId>maven-jar-plugin</artifactId>
+        <executions>
+          <execution>
+            <goals>
+              <goal>test-jar</goal>
+            </goals>
+          </execution>
+        </executions>
+      </plugin>
+    </plugins>
+  </build>
+
+  <dependencies>
+    <dependency>
+      <groupId>org.apache.flume</groupId>
+      <artifactId>flume-ng-sdk</artifactId>
+    </dependency>
+
+    <dependency>
+      <groupId>org.apache.flume</groupId>
+      <artifactId>flume-ng-core</artifactId>
+    </dependency>
+
+    <dependency>
+      <groupId>org.apache.flume</groupId>
+      <artifactId>flume-ng-configuration</artifactId>
+    </dependency>
+
+    <dependency>
+      <groupId>org.apache.flume.flume-shared</groupId>
+      <artifactId>flume-shared-kafka-test</artifactId>
+      <scope>test</scope>
+    </dependency>
+
+    <dependency>
+      <groupId>org.slf4j</groupId>
+      <artifactId>slf4j-api</artifactId>
+    </dependency>
+
+    <dependency>
+      <groupId>junit</groupId>
+      <artifactId>junit</artifactId>
+      <scope>test</scope>
+    </dependency>
+
+    <dependency>
+      <groupId>org.apache.kafka</groupId>
+      <artifactId>kafka_2.10</artifactId>
+      <scope>test</scope>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.kafka</groupId>
+      <artifactId>kafka-clients</artifactId>
+      <version>${kafka.version}</version>
+    </dependency>
+
+  </dependencies>
+
+</project>
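Note on the `KafkaSink.java` file that follows: in `process()`, a configured static partition ID is the starting point, and a per-event header, when present, overrides it. The sketch below isolates that rule so it can be read on its own. The class `PartitionSketch` and the method `resolve` are hypothetical names for this example and do not exist in the change.

```java
import java.util.Map;

// Hypothetical helper mirroring the partition choice made in KafkaSink.process().
final class PartitionSketch {
  // Returns the partition to use, or null to let the Kafka producer decide.
  static Integer resolve(Integer staticPartitionId,
                         String partitionHeader,
                         Map<String, String> headers) {
    Integer partitionId = staticPartitionId;
    // A configured header name lets an individual event override the static ID.
    if (partitionHeader != null) {
      String headerVal = headers.get(partitionHeader);
      if (headerVal != null) {
        // Throws NumberFormatException for a non-integer value; the sink
        // wraps that in an EventDeliveryException.
        partitionId = Integer.parseInt(headerVal);
      }
    }
    return partitionId;
  }
}
```

A `null` result corresponds to the branch in the sink that builds the `ProducerRecord` without a partition argument.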
diff --git a/code/flume-ng-sinks/flume-ng-kafka-sink/src/main/java/org/apache/flume/sink/kafka/KafkaSink.java b/code/flume-ng-sinks/flume-ng-kafka-sink/src/main/java/org/apache/flume/sink/kafka/KafkaSink.java
new file mode 100644
index 0000000..dd40224
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-kafka-sink/src/main/java/org/apache/flume/sink/kafka/KafkaSink.java
@@ -0,0 +1,460 @@
+/**
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements.  See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License.  You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+
+ */
+
+package org.apache.flume.sink.kafka;
+
+import com.google.common.base.Optional;
+import com.google.common.base.Throwables;
+import org.apache.avro.io.BinaryEncoder;
+import org.apache.avro.io.EncoderFactory;
+import org.apache.avro.specific.SpecificDatumReader;
+import org.apache.avro.specific.SpecificDatumWriter;
+import org.apache.flume.Channel;
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.flume.EventDeliveryException;
+import org.apache.flume.Transaction;
+import org.apache.flume.conf.Configurable;
+import org.apache.flume.conf.ConfigurationException;
+import org.apache.flume.conf.LogPrivacyUtil;
+import org.apache.flume.instrumentation.kafka.KafkaSinkCounter;
+import org.apache.flume.sink.AbstractSink;
+import org.apache.flume.source.avro.AvroFlumeEvent;
+import org.apache.kafka.clients.producer.Callback;
+import org.apache.kafka.clients.producer.KafkaProducer;
+import org.apache.kafka.clients.producer.ProducerConfig;
+import org.apache.kafka.clients.producer.ProducerRecord;
+import org.apache.kafka.clients.producer.RecordMetadata;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+import java.nio.ByteBuffer;
+import java.util.HashMap;
+import java.util.LinkedList;
+import java.util.List;
+import java.util.Map;
+import java.util.Properties;
+import java.util.concurrent.Future;
+
+import static org.apache.flume.sink.kafka.KafkaSinkConstants.BOOTSTRAP_SERVERS_CONFIG;
+import static org.apache.flume.sink.kafka.KafkaSinkConstants.BATCH_SIZE;
+import static org.apache.flume.sink.kafka.KafkaSinkConstants.DEFAULT_BATCH_SIZE;
+import static org.apache.flume.sink.kafka.KafkaSinkConstants.BROKER_LIST_FLUME_KEY;
+import static org.apache.flume.sink.kafka.KafkaSinkConstants.DEFAULT_ACKS;
+import static org.apache.flume.sink.kafka.KafkaSinkConstants.DEFAULT_KEY_SERIALIZER;
+import static org.apache.flume.sink.kafka.KafkaSinkConstants.DEFAULT_TOPIC;
+import static org.apache.flume.sink.kafka.KafkaSinkConstants.DEFAULT_VALUE_SERIAIZER;
+import static org.apache.flume.sink.kafka.KafkaSinkConstants.KAFKA_PRODUCER_PREFIX;
+import static org.apache.flume.sink.kafka.KafkaSinkConstants.KEY_HEADER;
+import static org.apache.flume.sink.kafka.KafkaSinkConstants.OLD_BATCH_SIZE;
+import static org.apache.flume.sink.kafka.KafkaSinkConstants.REQUIRED_ACKS_FLUME_KEY;
+import static org.apache.flume.sink.kafka.KafkaSinkConstants.TOPIC_CONFIG;
+import static org.apache.flume.sink.kafka.KafkaSinkConstants.TOPIC_HEADER;
+import static org.apache.flume.sink.kafka.KafkaSinkConstants.KEY_SERIALIZER_KEY;
+import static org.apache.flume.sink.kafka.KafkaSinkConstants.MESSAGE_SERIALIZER_KEY;
+
+
+/**
+ * A Flume Sink that publishes messages to Kafka.
+ * This is a general implementation that can be used with any Flume agent and
+ * channel.
+ * The message can be any event, and the key is a string read from the event
+ * header.
+ * To control partitioning, use an interceptor to generate a header with the
+ * partition key.
+ * <p/>
+ * Mandatory properties:
+ * brokerList -- can be a partial list, but at least 2 brokers are recommended
+ * for HA
+ * <p/>
+ * In addition, any property starting with "kafka." is passed along to the
+ * Kafka producer.
+ * Read the Kafka producer documentation to see which configurations can be used.
+ * <p/>
+ * Optional properties:
+ * topic -- a default is used when unset; set it in the event header
+ * instead if you need to support events with
+ * different topics
+ * batchSize -- how many messages to process in one batch. Larger batches
+ * improve throughput while adding latency.
+ * requiredAcks -- 0 (unsafe), 1 (accepted by at least one broker, default),
+ * -1 (accepted by all brokers)
+ * useFlumeEventFormat -- preserves event headers when serializing onto Kafka
+ * <p/>
+ * Header properties (per event):
+ * topic
+ * key
+ */
+public class KafkaSink extends AbstractSink implements Configurable {
+
+  private static final Logger logger = LoggerFactory.getLogger(KafkaSink.class);
+
+  private final Properties kafkaProps = new Properties();
+  private KafkaProducer<String, byte[]> producer;
+
+  private String topic;
+  private int batchSize;
+  private List<Future<RecordMetadata>> kafkaFutures;
+  private KafkaSinkCounter counter;
+  private boolean useAvroEventFormat;
+  private String partitionHeader = null;
+  private Integer staticPartitionId = null;
+  private Optional<SpecificDatumWriter<AvroFlumeEvent>> writer =
+          Optional.absent();
+  private Optional<SpecificDatumReader<AvroFlumeEvent>> reader =
+          Optional.absent();
+  private Optional<ByteArrayOutputStream> tempOutStream = Optional
+          .absent();
+
+  //Fine to use null for initial value, Avro will create new ones if this
+  // is null
+  private BinaryEncoder encoder = null;
+
+
+  //For testing
+  public String getTopic() {
+    return topic;
+  }
+
+  public int getBatchSize() {
+    return batchSize;
+  }
+
+  @Override
+  public Status process() throws EventDeliveryException {
+    Status result = Status.READY;
+    Channel channel = getChannel();
+    Transaction transaction = null;
+    Event event = null;
+    String eventTopic = null;
+    String eventKey = null;
+
+    try {
+      long processedEvents = 0;
+
+      transaction = channel.getTransaction();
+      transaction.begin();
+
+      kafkaFutures.clear();
+      long batchStartTime = System.nanoTime();
+      for (; processedEvents < batchSize; processedEvents += 1) {
+        event = channel.take();
+
+        if (event == null) {
+          // no events available in channel
+          if (processedEvents == 0) {
+            result = Status.BACKOFF;
+            counter.incrementBatchEmptyCount();
+          } else {
+            counter.incrementBatchUnderflowCount();
+          }
+          break;
+        }
+
+        byte[] eventBody = event.getBody();
+        Map<String, String> headers = event.getHeaders();
+
+        eventTopic = headers.get(TOPIC_HEADER);
+        if (eventTopic == null) {
+          eventTopic = topic;
+        }
+        eventKey = headers.get(KEY_HEADER);
+        if (logger.isTraceEnabled()) {
+          if (LogPrivacyUtil.allowLogRawData()) {
+            logger.trace("{Event} " + eventTopic + " : " + eventKey + " : "
+                + new String(eventBody, "UTF-8"));
+          } else {
+            logger.trace("{Event} " + eventTopic + " : " + eventKey);
+          }
+        }
+        logger.debug("event #{}", processedEvents);
+
+        // create a message and add to buffer
+        long startTime = System.currentTimeMillis();
+
+        Integer partitionId = null;
+        try {
+          ProducerRecord<String, byte[]> record;
+          if (staticPartitionId != null) {
+            partitionId = staticPartitionId;
+          }
+          //Allow a specified header to override a static ID
+          if (partitionHeader != null) {
+            String headerVal = event.getHeaders().get(partitionHeader);
+            if (headerVal != null) {
+              partitionId = Integer.parseInt(headerVal);
+            }
+          }
+          if (partitionId != null) {
+            record = new ProducerRecord<String, byte[]>(eventTopic, partitionId, eventKey,
+                serializeEvent(event, useAvroEventFormat));
+          } else {
+            record = new ProducerRecord<String, byte[]>(eventTopic, eventKey,
+                serializeEvent(event, useAvroEventFormat));
+          }
+          kafkaFutures.add(producer.send(record, new SinkCallback(startTime)));
+        } catch (NumberFormatException ex) {
+          throw new EventDeliveryException("Non integer partition id specified", ex);
+        } catch (Exception ex) {
+          // N.B. The producer.send() method throws all sorts of RuntimeExceptions
+          // Catching Exception here to wrap them neatly in an EventDeliveryException
+          // which is what our consumers will expect
+          throw new EventDeliveryException("Could not send event", ex);
+        }
+      }
+
+      //Prevent linger.ms from holding the batch
+      producer.flush();
+
+      // publish batch and commit.
+      if (processedEvents > 0) {
+        for (Future<RecordMetadata> future : kafkaFutures) {
+          future.get();
+        }
+        long endTime = System.nanoTime();
+        counter.addToKafkaEventSendTimer((endTime - batchStartTime) / (1000 * 1000));
+        counter.addToEventDrainSuccessCount(Long.valueOf(kafkaFutures.size()));
+      }
+
+      transaction.commit();
+
+    } catch (Exception ex) {
+      String errorMsg = "Failed to publish events";
+      logger.error(errorMsg, ex);
+      result = Status.BACKOFF;
+      if (transaction != null) {
+        try {
+          kafkaFutures.clear();
+          transaction.rollback();
+          counter.incrementRollbackCount();
+        } catch (Exception e) {
+          logger.error("Transaction rollback failed", e);
+          throw Throwables.propagate(e);
+        }
+      }
+      throw new EventDeliveryException(errorMsg, ex);
+    } finally {
+      if (transaction != null) {
+        transaction.close();
+      }
+    }
+
+    return result;
+  }
+
+  @Override
+  public synchronized void start() {
+    // instantiate the producer
+    producer = new KafkaProducer<String, byte[]>(kafkaProps);
+    counter.start();
+    super.start();
+  }
+
+  @Override
+  public synchronized void stop() {
+    producer.close();
+    counter.stop();
+    logger.info("Kafka Sink {} stopped. Metrics: {}", getName(), counter);
+    super.stop();
+  }
+
+
+  /**
+   * Configures the sink and generates the properties for the Kafka producer.
+   *
+   * Kafka producer properties are generated as follows:
+   * 1. We generate a properties object with some static defaults that
+   * can be overridden by Sink configuration
+   * 2. We add the configuration users added for Kafka (parameters starting
+   * with .kafka., which must be valid Kafka Producer properties)
+   * 3. We add the sink's documented parameters which can override other
+   * properties
+   *
+   * @param context the sink configuration supplied by Flume
+   */
+  @Override
+  public void configure(Context context) {
+
+    translateOldProps(context);
+
+    String topicStr = context.getString(TOPIC_CONFIG);
+    if (topicStr == null || topicStr.isEmpty()) {
+      topicStr = DEFAULT_TOPIC;
+      logger.warn("Topic was not specified. Using {} as the topic.", topicStr);
+    } else {
+      logger.info("Using the static topic {}. This may be overridden by event headers", topicStr);
+    }
+
+    topic = topicStr;
+
+    batchSize = context.getInteger(BATCH_SIZE, DEFAULT_BATCH_SIZE);
+
+    if (logger.isDebugEnabled()) {
+      logger.debug("Using batch size: {}", batchSize);
+    }
+
+    useAvroEventFormat = context.getBoolean(KafkaSinkConstants.AVRO_EVENT,
+                                            KafkaSinkConstants.DEFAULT_AVRO_EVENT);
+
+    partitionHeader = context.getString(KafkaSinkConstants.PARTITION_HEADER_NAME);
+    staticPartitionId = context.getInteger(KafkaSinkConstants.STATIC_PARTITION_CONF);
+
+    if (logger.isDebugEnabled()) {
+      logger.debug(KafkaSinkConstants.AVRO_EVENT + " set to: {}", useAvroEventFormat);
+    }
+
+    kafkaFutures = new LinkedList<Future<RecordMetadata>>();
+
+    String bootStrapServers = context.getString(BOOTSTRAP_SERVERS_CONFIG);
+    if (bootStrapServers == null || bootStrapServers.isEmpty()) {
+      throw new ConfigurationException("Bootstrap Servers must be specified");
+    }
+
+    setProducerProps(context, bootStrapServers);
+
+    if (logger.isDebugEnabled() && LogPrivacyUtil.allowLogPrintConfig()) {
+      logger.debug("Kafka producer properties: {}", kafkaProps);
+    }
+
+    if (counter == null) {
+      counter = new KafkaSinkCounter(getName());
+    }
+  }
+
+  private void translateOldProps(Context ctx) {
+    String oldTopic = ctx.getString("topic");
+    if (!ctx.containsKey(TOPIC_CONFIG) && oldTopic != null && !oldTopic.isEmpty()) {
+      ctx.put(TOPIC_CONFIG, oldTopic);
+      logger.warn("{} is deprecated. Please use the parameter {}", "topic", TOPIC_CONFIG);
+    }
+
+    //Broker List
+    // If there is no value we need to check and set the old param and log a warning message
+    if (!(ctx.containsKey(BOOTSTRAP_SERVERS_CONFIG))) {
+      String brokerList = ctx.getString(BROKER_LIST_FLUME_KEY);
+      if (brokerList == null || brokerList.isEmpty()) {
+        throw new ConfigurationException("Bootstrap Servers must be specified");
+      } else {
+        ctx.put(BOOTSTRAP_SERVERS_CONFIG, brokerList);
+        logger.warn("{} is deprecated. Please use the parameter {}",
+                    BROKER_LIST_FLUME_KEY, BOOTSTRAP_SERVERS_CONFIG);
+      }
+    }
+
+    //batch Size
+    if (!(ctx.containsKey(BATCH_SIZE))) {
+      String oldBatchSize = ctx.getString(OLD_BATCH_SIZE);
+      if (oldBatchSize != null && !oldBatchSize.isEmpty()) {
+        ctx.put(BATCH_SIZE, oldBatchSize);
+        logger.warn("{} is deprecated. Please use the parameter {}", OLD_BATCH_SIZE, BATCH_SIZE);
+      }
+    }
+
+    // Acks
+    if (!(ctx.containsKey(KAFKA_PRODUCER_PREFIX + ProducerConfig.ACKS_CONFIG))) {
+      String requiredKey = ctx.getString(
+              KafkaSinkConstants.REQUIRED_ACKS_FLUME_KEY);
+      if (requiredKey != null && !requiredKey.isEmpty()) {
+        ctx.put(KAFKA_PRODUCER_PREFIX + ProducerConfig.ACKS_CONFIG, requiredKey);
+        logger.warn("{} is deprecated. Please use the parameter {}", REQUIRED_ACKS_FLUME_KEY,
+                KAFKA_PRODUCER_PREFIX + ProducerConfig.ACKS_CONFIG);
+      }
+    }
+
+    if (ctx.containsKey(KEY_SERIALIZER_KEY)) {
+      logger.warn("{} is deprecated. Flume now uses the latest Kafka producer which implements " +
+                  "a different interface for serializers. Please use the parameter {}",
+                  KEY_SERIALIZER_KEY,
+                  KAFKA_PRODUCER_PREFIX + ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG);
+    }
+
+    if (ctx.containsKey(MESSAGE_SERIALIZER_KEY)) {
+      logger.warn("{} is deprecated. Flume now uses the latest Kafka producer which implements " +
+                  "a different interface for serializers. Please use the parameter {}",
+                  MESSAGE_SERIALIZER_KEY,
+                  KAFKA_PRODUCER_PREFIX + ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG);
+    }
+  }
+
+  private void setProducerProps(Context context, String bootStrapServers) {
+    kafkaProps.put(ProducerConfig.ACKS_CONFIG, DEFAULT_ACKS);
+    //Defaults overridden based on config
+    kafkaProps.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, DEFAULT_KEY_SERIALIZER);
+    kafkaProps.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, DEFAULT_VALUE_SERIAIZER);
+    kafkaProps.putAll(context.getSubProperties(KAFKA_PRODUCER_PREFIX));
+    kafkaProps.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, bootStrapServers);
+  }
+
+  protected Properties getKafkaProps() {
+    return kafkaProps;
+  }
+
+  private byte[] serializeEvent(Event event, boolean useAvroEventFormat) throws IOException {
+    byte[] bytes;
+    if (useAvroEventFormat) {
+      if (!tempOutStream.isPresent()) {
+        tempOutStream = Optional.of(new ByteArrayOutputStream());
+      }
+      if (!writer.isPresent()) {
+        writer = Optional.of(new SpecificDatumWriter<AvroFlumeEvent>(AvroFlumeEvent.class));
+      }
+      tempOutStream.get().reset();
+      AvroFlumeEvent e = new AvroFlumeEvent(toCharSeqMap(event.getHeaders()),
+                                            ByteBuffer.wrap(event.getBody()));
+      encoder = EncoderFactory.get().directBinaryEncoder(tempOutStream.get(), encoder);
+      writer.get().write(e, encoder);
+      encoder.flush();
+      bytes = tempOutStream.get().toByteArray();
+    } else {
+      bytes = event.getBody();
+    }
+    return bytes;
+  }
+
+  private static Map<CharSequence, CharSequence> toCharSeqMap(Map<String, String> stringMap) {
+    Map<CharSequence, CharSequence> charSeqMap = new HashMap<CharSequence, CharSequence>();
+    for (Map.Entry<String, String> entry : stringMap.entrySet()) {
+      charSeqMap.put(entry.getKey(), entry.getValue());
+    }
+    return charSeqMap;
+  }
+
+}
+
+class SinkCallback implements Callback {
+  private static final Logger logger = LoggerFactory.getLogger(SinkCallback.class);
+  private long startTime;
+
+  public SinkCallback(long startTime) {
+    this.startTime = startTime;
+  }
+
+  public void onCompletion(RecordMetadata metadata, Exception exception) {
+    if (exception != null) {
+      logger.debug("Error sending message to Kafka: {}", exception.getMessage());
+    }
+
+    if (logger.isDebugEnabled()) {
+      long eventElapsedTime = System.currentTimeMillis() - startTime;
+      // metadata can be null when the send failed, so guard before dereferencing it
+      if (metadata != null) {
+        logger.debug("Acked message partition:{} offset:{}",
+                     metadata.partition(), metadata.offset());
+      }
+      logger.debug("Elapsed time for send: {}", eventElapsedTime);
+    }
+  }
+}
+
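Review note on `translateOldProps`: the method follows one rule throughout. A deprecated key is copied to its replacement only when the replacement is absent and the deprecated key holds a non-empty value. The sketch below shows that rule on a plain `Map` so it can be run without Flume on the classpath. The class name `LegacyKeyTranslation` and the `translate` helper are hypothetical and not part of this patch; the key names are the ones defined in `KafkaSinkConstants`.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, Flume-free illustration of the deprecated-key translation rule.
final class LegacyKeyTranslation {

  // Copies oldKey's value to newKey when newKey is absent and oldKey is non-empty.
  // Returns true when a translation happened (the point where the sink logs a warning).
  static boolean translate(Map<String, String> conf, String oldKey, String newKey) {
    if (conf.containsKey(newKey)) {
      return false; // the new key always wins
    }
    String oldValue = conf.get(oldKey);
    if (oldValue == null || oldValue.isEmpty()) {
      return false; // nothing to translate
    }
    conf.put(newKey, oldValue);
    return true;
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    conf.put("brokerList", "localhost:9092");
    conf.put("batchSize", "300");
    conf.put("flumeBatchSize", "50");

    // brokerList is translated because kafka.bootstrap.servers is absent
    System.out.println(translate(conf, "brokerList", "kafka.bootstrap.servers"));
    // batchSize is ignored because flumeBatchSize is already set
    System.out.println(translate(conf, "batchSize", "flumeBatchSize"));
    System.out.println(conf);
  }
}
```

Returning a boolean keeps the logging decision with the caller, which is one way to avoid warning about a deprecated key that was never supplied.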
diff --git a/code/flume-ng-sinks/flume-ng-kafka-sink/src/main/java/org/apache/flume/sink/kafka/KafkaSinkConstants.java b/code/flume-ng-sinks/flume-ng-kafka-sink/src/main/java/org/apache/flume/sink/kafka/KafkaSinkConstants.java
new file mode 100644
index 0000000..7c819f5
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-kafka-sink/src/main/java/org/apache/flume/sink/kafka/KafkaSinkConstants.java
@@ -0,0 +1,63 @@
+/**
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements.  See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License.  You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+ */
+
+package org.apache.flume.sink.kafka;
+
+import org.apache.kafka.clients.CommonClientConfigs;
+
+public class KafkaSinkConstants {
+
+  public static final String KAFKA_PREFIX = "kafka.";
+  public static final String KAFKA_PRODUCER_PREFIX = KAFKA_PREFIX + "producer.";
+
+  /* Properties */
+
+  public static final String TOPIC_CONFIG = KAFKA_PREFIX + "topic";
+  public static final String BATCH_SIZE = "flumeBatchSize";
+  public static final String BOOTSTRAP_SERVERS_CONFIG =
+      KAFKA_PREFIX + CommonClientConfigs.BOOTSTRAP_SERVERS_CONFIG;
+
+  public static final String KEY_HEADER = "key";
+  public static final String TOPIC_HEADER = "topic";
+
+  public static final String AVRO_EVENT = "useFlumeEventFormat";
+  public static final boolean DEFAULT_AVRO_EVENT = false;
+
+  public static final String PARTITION_HEADER_NAME = "partitionIdHeader";
+  public static final String STATIC_PARTITION_CONF = "defaultPartitionId";
+
+  public static final String DEFAULT_KEY_SERIALIZER =
+      "org.apache.kafka.common.serialization.StringSerializer";
+  public static final String DEFAULT_VALUE_SERIAIZER =
+      "org.apache.kafka.common.serialization.ByteArraySerializer";
+
+  public static final int DEFAULT_BATCH_SIZE = 100;
+  public static final String DEFAULT_TOPIC = "default-flume-topic";
+  public static final String DEFAULT_ACKS = "1";
+
+  /* Old Properties */
+
+  /* Properties */
+
+  public static final String OLD_BATCH_SIZE = "batchSize";
+  public static final String MESSAGE_SERIALIZER_KEY = "serializer.class";
+  public static final String KEY_SERIALIZER_KEY = "key.serializer.class";
+  public static final String BROKER_LIST_FLUME_KEY = "brokerList";
+  public static final String REQUIRED_ACKS_FLUME_KEY = "requiredAcks";
+}
+
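Review note on the prefix constants: `setProducerProps` layers the producer configuration in three steps. Defaults go in first, then every `kafka.producer.*` entry with the prefix stripped, and finally the bootstrap servers, which therefore always win. The following is a minimal sketch of that ordering using only `java.util`. `ProducerPropsSketch` and `subProperties` are hypothetical stand-ins (the latter for `Context.getSubProperties`), and the literal keys `acks`, `key.serializer`, `value.serializer` and `bootstrap.servers` are the values of the corresponding Kafka config constants.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical, Flume-free illustration of how producer properties are layered.
final class ProducerPropsSketch {
  static final String KAFKA_PRODUCER_PREFIX = "kafka.producer.";

  // Stand-in for Context.getSubProperties: keep entries under the prefix, strip the prefix.
  static Map<String, String> subProperties(Map<String, String> conf, String prefix) {
    Map<String, String> result = new HashMap<>();
    for (Map.Entry<String, String> entry : conf.entrySet()) {
      if (entry.getKey().startsWith(prefix)) {
        result.put(entry.getKey().substring(prefix.length()), entry.getValue());
      }
    }
    return result;
  }

  static Properties buildProducerProps(Map<String, String> conf, String bootstrapServers) {
    Properties props = new Properties();
    // 1. defaults
    props.put("acks", "1");
    props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
    props.put("value.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");
    // 2. user overrides from kafka.producer.*
    props.putAll(subProperties(conf, KAFKA_PRODUCER_PREFIX));
    // 3. bootstrap servers are applied last and cannot be overridden via the prefix
    props.put("bootstrap.servers", bootstrapServers);
    return props;
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    conf.put("kafka.producer.acks", "all");
    conf.put("kafka.producer.fake.property", "kafka.property.value");
    conf.put("kafka.topic", "not-a-producer-property");
    System.out.println(buildProducerProps(conf, "localhost:9092"));
  }
}
```

This mirrors what `testKafkaProperties` asserts: `kafka.producer.fake.property` surfaces as `fake.property`, and an overridden serializer replaces the default.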
diff --git a/code/flume-ng-sinks/flume-ng-kafka-sink/src/test/java/org/apache/flume/sink/kafka/TestConstants.java b/code/flume-ng-sinks/flume-ng-kafka-sink/src/test/java/org/apache/flume/sink/kafka/TestConstants.java
new file mode 100644
index 0000000..6d85700
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-kafka-sink/src/test/java/org/apache/flume/sink/kafka/TestConstants.java
@@ -0,0 +1,27 @@
+/**
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements.  See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License.  You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+ */
+
+package org.apache.flume.sink.kafka;
+
+public class TestConstants {
+  public static final String STATIC_TOPIC = "static-topic";
+  public static final String CUSTOM_KEY = "custom-key";
+  public static final String CUSTOM_TOPIC = "custom-topic";
+  public static final String HEADER_1_VALUE = "test-avro-header";
+  public static final String HEADER_1_KEY = "header1";
+}
diff --git a/code/flume-ng-sinks/flume-ng-kafka-sink/src/test/java/org/apache/flume/sink/kafka/TestKafkaSink.java b/code/flume-ng-sinks/flume-ng-kafka-sink/src/test/java/org/apache/flume/sink/kafka/TestKafkaSink.java
new file mode 100644
index 0000000..7eccf76
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-kafka-sink/src/test/java/org/apache/flume/sink/kafka/TestKafkaSink.java
@@ -0,0 +1,550 @@
+/**
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements.  See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License.  You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+ */
+
+package org.apache.flume.sink.kafka;
+
+import com.google.common.base.Charsets;
+
+import kafka.admin.AdminUtils;
+import kafka.message.MessageAndMetadata;
+import kafka.utils.ZkUtils;
+
+import org.apache.avro.io.BinaryDecoder;
+import org.apache.avro.io.DecoderFactory;
+import org.apache.avro.specific.SpecificDatumReader;
+import org.apache.avro.util.Utf8;
+import org.apache.commons.lang.RandomStringUtils;
+import org.apache.flume.Channel;
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.flume.EventDeliveryException;
+import org.apache.flume.Sink;
+import org.apache.flume.Transaction;
+import org.apache.flume.channel.MemoryChannel;
+import org.apache.flume.conf.Configurables;
+import org.apache.flume.event.EventBuilder;
+import org.apache.flume.shared.kafka.test.KafkaPartitionTestUtil;
+import org.apache.flume.shared.kafka.test.PartitionOption;
+import org.apache.flume.shared.kafka.test.PartitionTestScenario;
+import org.apache.flume.sink.kafka.util.TestUtil;
+import org.apache.flume.source.avro.AvroFlumeEvent;
+import org.apache.kafka.clients.CommonClientConfigs;
+import org.apache.kafka.clients.producer.ProducerConfig;
+import org.junit.AfterClass;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+import java.io.ByteArrayInputStream;
+import java.io.IOException;
+import java.io.UnsupportedEncodingException;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Map;
+import java.util.Properties;
+import java.util.Set;
+
+import static org.apache.flume.sink.kafka.KafkaSinkConstants.AVRO_EVENT;
+import static org.apache.flume.sink.kafka.KafkaSinkConstants.BATCH_SIZE;
+import static org.apache.flume.sink.kafka.KafkaSinkConstants.BOOTSTRAP_SERVERS_CONFIG;
+import static org.apache.flume.sink.kafka.KafkaSinkConstants.BROKER_LIST_FLUME_KEY;
+import static org.apache.flume.sink.kafka.KafkaSinkConstants.DEFAULT_KEY_SERIALIZER;
+import static org.apache.flume.sink.kafka.KafkaSinkConstants.DEFAULT_TOPIC;
+import static org.apache.flume.sink.kafka.KafkaSinkConstants.KAFKA_PREFIX;
+import static org.apache.flume.sink.kafka.KafkaSinkConstants.KAFKA_PRODUCER_PREFIX;
+import static org.apache.flume.sink.kafka.KafkaSinkConstants.OLD_BATCH_SIZE;
+import static org.apache.flume.sink.kafka.KafkaSinkConstants.REQUIRED_ACKS_FLUME_KEY;
+import static org.apache.flume.sink.kafka.KafkaSinkConstants.TOPIC_CONFIG;
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertNull;
+import static org.junit.Assert.fail;
+
+/**
+ * Unit tests for Kafka Sink
+ */
+public class TestKafkaSink {
+
+  private static TestUtil testUtil = TestUtil.getInstance();
+  private final Set<String> usedTopics = new HashSet<String>();
+
+  @BeforeClass
+  public static void setup() {
+    testUtil.prepare();
+    List<String> topics = new ArrayList<String>(3);
+    topics.add(DEFAULT_TOPIC);
+    topics.add(TestConstants.STATIC_TOPIC);
+    topics.add(TestConstants.CUSTOM_TOPIC);
+    testUtil.initTopicList(topics);
+  }
+
+  @AfterClass
+  public static void tearDown() {
+    testUtil.tearDown();
+  }
+
+  @Test
+  public void testKafkaProperties() {
+
+    KafkaSink kafkaSink = new KafkaSink();
+    Context context = new Context();
+    // TOPIC_CONFIG already carries the "kafka." prefix
+    context.put(TOPIC_CONFIG, "");
+    context.put(KAFKA_PRODUCER_PREFIX + ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
+                "override.default.serializer");
+    context.put("kafka.producer.fake.property", "kafka.property.value");
+    context.put("kafka.bootstrap.servers", "localhost:9092,localhost:9092");
+    context.put("brokerList", "real-broker-list");
+    Configurables.configure(kafkaSink, context);
+
+    Properties kafkaProps = kafkaSink.getKafkaProps();
+
+    //check that we have defaults set
+    assertEquals(DEFAULT_KEY_SERIALIZER,
+                 kafkaProps.getProperty(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG));
+    //check that kafka properties override the default and get correct name
+    assertEquals("override.default.serializer",
+                 kafkaProps.getProperty(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG));
+    //check that any kafka-producer property gets in
+    assertEquals("kafka.property.value",
+                 kafkaProps.getProperty("fake.property"));
+    //check that documented property overrides defaults
+    assertEquals("localhost:9092,localhost:9092",
+                 kafkaProps.getProperty("bootstrap.servers"));
+  }
+
+  @Test
+  public void testOldProperties() {
+    KafkaSink kafkaSink = new KafkaSink();
+    Context context = new Context();
+    context.put("topic", "test-topic");
+    context.put(OLD_BATCH_SIZE, "300");
+    context.put(BROKER_LIST_FLUME_KEY, "localhost:9092,localhost:9092");
+    context.put(REQUIRED_ACKS_FLUME_KEY, "all");
+    Configurables.configure(kafkaSink, context);
+
+    Properties kafkaProps = kafkaSink.getKafkaProps();
+
+    assertEquals("test-topic", kafkaSink.getTopic());
+    assertEquals(300, kafkaSink.getBatchSize());
+    assertEquals("localhost:9092,localhost:9092",
+                 kafkaProps.getProperty(CommonClientConfigs.BOOTSTRAP_SERVERS_CONFIG));
+    assertEquals("all", kafkaProps.getProperty(ProducerConfig.ACKS_CONFIG));
+
+  }
+
+  @Test
+  public void testDefaultTopic() {
+    Sink kafkaSink = new KafkaSink();
+    Context context = prepareDefaultContext();
+    Configurables.configure(kafkaSink, context);
+    Channel memoryChannel = new MemoryChannel();
+    Configurables.configure(memoryChannel, context);
+    kafkaSink.setChannel(memoryChannel);
+    kafkaSink.start();
+
+    String msg = "default-topic-test";
+    Transaction tx = memoryChannel.getTransaction();
+    tx.begin();
+    Event event = EventBuilder.withBody(msg.getBytes());
+    memoryChannel.put(event);
+    tx.commit();
+    tx.close();
+
+    try {
+      Sink.Status status = kafkaSink.process();
+      if (status == Sink.Status.BACKOFF) {
+        fail("Error Occurred");
+      }
+    } catch (EventDeliveryException ex) {
+      // ignore
+    }
+
+    String fetchedMsg = new String((byte[]) testUtil.getNextMessageFromConsumer(DEFAULT_TOPIC)
+                                                    .message());
+    assertEquals(msg, fetchedMsg);
+  }
+
+  @Test
+  public void testStaticTopic() {
+    Context context = prepareDefaultContext();
+    // add the static topic
+    context.put(TOPIC_CONFIG, TestConstants.STATIC_TOPIC);
+    String msg = "static-topic-test";
+
+    try {
+      Sink.Status status = prepareAndSend(context, msg);
+      if (status == Sink.Status.BACKOFF) {
+        fail("Error Occurred");
+      }
+    } catch (EventDeliveryException ex) {
+      // ignore
+    }
+
+    String fetchedMsg = new String((byte[]) testUtil.getNextMessageFromConsumer(
+        TestConstants.STATIC_TOPIC).message());
+    assertEquals(msg, fetchedMsg);
+  }
+
+  @Test
+  public void testTopicAndKeyFromHeader() throws UnsupportedEncodingException {
+    Sink kafkaSink = new KafkaSink();
+    Context context = prepareDefaultContext();
+    Configurables.configure(kafkaSink, context);
+    Channel memoryChannel = new MemoryChannel();
+    Configurables.configure(memoryChannel, context);
+    kafkaSink.setChannel(memoryChannel);
+    kafkaSink.start();
+
+    String msg = "test-topic-and-key-from-header";
+    Map<String, String> headers = new HashMap<String, String>();
+    headers.put("topic", TestConstants.CUSTOM_TOPIC);
+    headers.put("key", TestConstants.CUSTOM_KEY);
+    Transaction tx = memoryChannel.getTransaction();
+    tx.begin();
+    Event event = EventBuilder.withBody(msg.getBytes(), headers);
+    memoryChannel.put(event);
+    tx.commit();
+    tx.close();
+
+    try {
+      Sink.Status status = kafkaSink.process();
+      if (status == Sink.Status.BACKOFF) {
+        fail("Error Occurred");
+      }
+    } catch (EventDeliveryException ex) {
+      // ignore
+    }
+
+    MessageAndMetadata fetchedMsg =
+        testUtil.getNextMessageFromConsumer(TestConstants.CUSTOM_TOPIC);
+
+    assertEquals(msg, new String((byte[]) fetchedMsg.message(), "UTF-8"));
+    assertEquals(TestConstants.CUSTOM_KEY,
+                 new String((byte[]) fetchedMsg.key(), "UTF-8"));
+  }
+
+  @SuppressWarnings("rawtypes")
+  @Test
+  public void testAvroEvent() throws IOException {
+    Sink kafkaSink = new KafkaSink();
+    Context context = prepareDefaultContext();
+    context.put(AVRO_EVENT, "true");
+    Configurables.configure(kafkaSink, context);
+    Channel memoryChannel = new MemoryChannel();
+    Configurables.configure(memoryChannel, context);
+    kafkaSink.setChannel(memoryChannel);
+    kafkaSink.start();
+
+    String msg = "test-avro-event";
+
+    Map<String, String> headers = new HashMap<String, String>();
+    headers.put("topic", TestConstants.CUSTOM_TOPIC);
+    headers.put("key", TestConstants.CUSTOM_KEY);
+    headers.put(TestConstants.HEADER_1_KEY, TestConstants.HEADER_1_VALUE);
+    Transaction tx = memoryChannel.getTransaction();
+    tx.begin();
+    Event event = EventBuilder.withBody(msg.getBytes(), headers);
+    memoryChannel.put(event);
+    tx.commit();
+    tx.close();
+
+    try {
+      Sink.Status status = kafkaSink.process();
+      if (status == Sink.Status.BACKOFF) {
+        fail("Error Occurred");
+      }
+    } catch (EventDeliveryException ex) {
+      // ignore
+    }
+
+    MessageAndMetadata fetchedMsg = testUtil.getNextMessageFromConsumer(TestConstants.CUSTOM_TOPIC);
+
+    ByteArrayInputStream in = new ByteArrayInputStream((byte[]) fetchedMsg.message());
+    BinaryDecoder decoder = DecoderFactory.get().directBinaryDecoder(in, null);
+    SpecificDatumReader<AvroFlumeEvent> reader =
+        new SpecificDatumReader<AvroFlumeEvent>(AvroFlumeEvent.class);
+
+    AvroFlumeEvent avroevent = reader.read(null, decoder);
+
+    String eventBody = new String(avroevent.getBody().array(), Charsets.UTF_8);
+    Map<CharSequence, CharSequence> eventHeaders = avroevent.getHeaders();
+
+    assertEquals(msg, eventBody);
+    assertEquals(TestConstants.CUSTOM_KEY, new String((byte[]) fetchedMsg.key(), "UTF-8"));
+
+    assertEquals(TestConstants.HEADER_1_VALUE,
+                 eventHeaders.get(new Utf8(TestConstants.HEADER_1_KEY)).toString());
+    assertEquals(TestConstants.CUSTOM_KEY, eventHeaders.get(new Utf8("key")).toString());
+  }
+
+  @Test
+  public void testEmptyChannel() throws UnsupportedEncodingException, EventDeliveryException {
+    Sink kafkaSink = new KafkaSink();
+    Context context = prepareDefaultContext();
+    Configurables.configure(kafkaSink, context);
+    Channel memoryChannel = new MemoryChannel();
+    Configurables.configure(memoryChannel, context);
+    kafkaSink.setChannel(memoryChannel);
+    kafkaSink.start();
+
+    Sink.Status status = kafkaSink.process();
+    if (status != Sink.Status.BACKOFF) {
+      fail("Error Occurred");
+    }
+    assertNull(testUtil.getNextMessageFromConsumer(DEFAULT_TOPIC));
+  }
+
+  @Test
+  public void testPartitionHeaderSet() throws Exception {
+    doPartitionHeader(PartitionTestScenario.PARTITION_ID_HEADER_ONLY);
+  }
+
+  @Test
+  public void testPartitionHeaderNotSet() throws Exception {
+    doPartitionHeader(PartitionTestScenario.NO_PARTITION_HEADERS);
+  }
+
+  @Test
+  public void testStaticPartitionAndHeaderSet() throws Exception {
+    doPartitionHeader(PartitionTestScenario.STATIC_HEADER_AND_PARTITION_ID);
+  }
+
+  @Test
+  public void testStaticPartitionHeaderNotSet() throws Exception {
+    doPartitionHeader(PartitionTestScenario.STATIC_HEADER_ONLY);
+  }
+
+  @Test
+  public void testPartitionHeaderMissing() throws Exception {
+    doPartitionErrors(PartitionOption.NOTSET);
+  }
+
+  @Test(expected = org.apache.flume.EventDeliveryException.class)
+  public void testPartitionHeaderOutOfRange() throws Exception {
+    doPartitionErrors(PartitionOption.VALIDBUTOUTOFRANGE);
+  }
+
+  @Test(expected = org.apache.flume.EventDeliveryException.class)
+  public void testPartitionHeaderInvalid() throws Exception {
+    doPartitionErrors(PartitionOption.NOTANUMBER);
+  }
+
+  /**
+   * This function tests three scenarios:
+   * 1. PartitionOption.VALIDBUTOUTOFRANGE: An integer partition is provided,
+   *    however it exceeds the number of partitions available on the topic.
+   *    Expected behaviour: EventDeliveryException thrown.
+   *
+   * 2. PartitionOption.NOTSET: The partition header is not actually set.
+   *    Expected behaviour: Exception is not thrown because the code avoids an NPE.
+   *
+   * 3. PartitionOption.NOTANUMBER: The partition header is set, but is not an Integer.
+   *    Expected behaviour: EventDeliveryException thrown.
+   *
+   * @param option the partition header scenario to exercise
+   * @throws Exception
+   */
+  private void doPartitionErrors(PartitionOption option) throws Exception {
+    Sink kafkaSink = new KafkaSink();
+    Context context = prepareDefaultContext();
+    context.put(KafkaSinkConstants.PARTITION_HEADER_NAME, "partition-header");
+
+    Configurables.configure(kafkaSink, context);
+    Channel memoryChannel = new MemoryChannel();
+    Configurables.configure(memoryChannel, context);
+    kafkaSink.setChannel(memoryChannel);
+    kafkaSink.start();
+
+    String topic = findUnusedTopic();
+    createTopic(topic, 5);
+
+    Transaction tx = memoryChannel.getTransaction();
+    tx.begin();
+
+    Map<String, String> headers = new HashMap<String, String>();
+    headers.put("topic", topic);
+    switch (option) {
+      case VALIDBUTOUTOFRANGE:
+        headers.put("partition-header", "9");
+        break;
+      case NOTSET:
+        headers.put("wrong-header", "2");
+        break;
+      case NOTANUMBER:
+        headers.put("partition-header", "not-a-number");
+        break;
+      default:
+        break;
+    }
+
+    Event event = EventBuilder.withBody(String.valueOf(9).getBytes(), headers);
+
+    memoryChannel.put(event);
+    tx.commit();
+    tx.close();
+
+    Sink.Status status = kafkaSink.process();
+    assertEquals(Sink.Status.READY, status);
+
+    deleteTopic(topic);
+
+  }
+
+  /**
+   * This method tests both the default behavior (no partition header or static
+   * partition configured) and the behaviour when a partition ID is supplied via
+   * the partition header and/or the static partition setting.
+   * Under the default behaviour, one would expect an even distribution of
+   * messages to partitions, however when a partition ID is used we manually create
+   * a large skew to some partitions and then verify that this actually happened
+   * by reading messages directly using a Kafka Consumer.
+   *
+   * @param scenario which combination of partition header and static partition to test
+   * @throws Exception
+   */
+  private void doPartitionHeader(PartitionTestScenario scenario) throws Exception {
+    final int numPtns = 5;
+    final int numMsgs = numPtns * 10;
+    final Integer staticPtn = 3;
+
+    String topic = findUnusedTopic();
+    createTopic(topic, numPtns);
+    Context context = prepareDefaultContext();
+    context.put(BATCH_SIZE, "100");
+
+    if (scenario == PartitionTestScenario.PARTITION_ID_HEADER_ONLY ||
+        scenario == PartitionTestScenario.STATIC_HEADER_AND_PARTITION_ID) {
+      context.put(KafkaSinkConstants.PARTITION_HEADER_NAME,
+                  KafkaPartitionTestUtil.PARTITION_HEADER);
+    }
+    if (scenario == PartitionTestScenario.STATIC_HEADER_AND_PARTITION_ID ||
+        scenario == PartitionTestScenario.STATIC_HEADER_ONLY) {
+      context.put(KafkaSinkConstants.STATIC_PARTITION_CONF, staticPtn.toString());
+    }
+    Sink kafkaSink = new KafkaSink();
+
+    Configurables.configure(kafkaSink, context);
+    Channel memoryChannel = new MemoryChannel();
+    Configurables.configure(memoryChannel, context);
+    kafkaSink.setChannel(memoryChannel);
+    kafkaSink.start();
+
+    //Create a map of PartitionId:List<Messages> according to the desired distribution
+    Map<Integer, List<Event>> partitionMap = new HashMap<Integer, List<Event>>(numPtns);
+    for (int i = 0; i < numPtns; i++) {
+      partitionMap.put(i, new ArrayList<Event>());
+    }
+    Transaction tx = memoryChannel.getTransaction();
+    tx.begin();
+
+    List<Event> orderedEvents = KafkaPartitionTestUtil.generateSkewedMessageList(scenario, numMsgs,
+                                                                 partitionMap, numPtns, staticPtn);
+
+    for (Event event : orderedEvents) {
+      event.getHeaders().put("topic", topic);
+      memoryChannel.put(event);
+    }
+
+    tx.commit();
+    tx.close();
+
+    Sink.Status status = kafkaSink.process();
+    assertEquals(Sink.Status.READY, status);
+
+    Properties props = new Properties();
+    props.put("bootstrap.servers", testUtil.getKafkaServerUrl());
+    props.put("group.id", "group_1");
+    props.put("enable.auto.commit", "true");
+    props.put("auto.commit.interval.ms", "1000");
+    props.put("session.timeout.ms", "30000");
+    props.put("key.deserializer",
+        "org.apache.kafka.common.serialization.StringDeserializer");
+    props.put("value.deserializer",
+        "org.apache.kafka.common.serialization.ByteArrayDeserializer");
+    props.put("auto.offset.reset", "earliest");
+    Map<Integer, List<byte[]>> resultsMap =
+        KafkaPartitionTestUtil.retrieveRecordsFromPartitions(topic, numPtns, props);
+
+    KafkaPartitionTestUtil.checkResultsAgainstSkew(scenario, partitionMap, resultsMap, staticPtn,
+                                                   numMsgs);
+
+    memoryChannel.stop();
+    kafkaSink.stop();
+    deleteTopic(topic);
+
+  }
+
+  private Context prepareDefaultContext() {
+    // Prepares a default context with Kafka Server Properties
+    Context context = new Context();
+    context.put(BOOTSTRAP_SERVERS_CONFIG, testUtil.getKafkaServerUrl());
+    context.put(BATCH_SIZE, "1");
+    return context;
+  }
+
+  private Sink.Status prepareAndSend(Context context, String msg)
+      throws EventDeliveryException {
+    Sink kafkaSink = new KafkaSink();
+    Configurables.configure(kafkaSink, context);
+    Channel memoryChannel = new MemoryChannel();
+    Configurables.configure(memoryChannel, context);
+    kafkaSink.setChannel(memoryChannel);
+    kafkaSink.start();
+
+    Transaction tx = memoryChannel.getTransaction();
+    tx.begin();
+    Event event = EventBuilder.withBody(msg.getBytes());
+    memoryChannel.put(event);
+    tx.commit();
+    tx.close();
+
+    return kafkaSink.process();
+  }
+
+  public static void createTopic(String topicName, int numPartitions) {
+    int sessionTimeoutMs = 10000;
+    int connectionTimeoutMs = 10000;
+    ZkUtils zkUtils =
+        ZkUtils.apply(testUtil.getZkUrl(), sessionTimeoutMs, connectionTimeoutMs, false);
+    try {
+      int replicationFactor = 1;
+      Properties topicConfig = new Properties();
+      AdminUtils.createTopic(zkUtils, topicName, numPartitions, replicationFactor, topicConfig);
+    } finally {
+      // release the ZooKeeper connection opened for this call
+      zkUtils.close();
+    }
+  }
+
+  public static void deleteTopic(String topicName) {
+    int sessionTimeoutMs = 10000;
+    int connectionTimeoutMs = 10000;
+    ZkUtils zkUtils =
+        ZkUtils.apply(testUtil.getZkUrl(), sessionTimeoutMs, connectionTimeoutMs, false);
+    try {
+      AdminUtils.deleteTopic(zkUtils, topicName);
+    } finally {
+      // release the ZooKeeper connection opened for this call
+      zkUtils.close();
+    }
+  }
+
+  public String findUnusedTopic() {
+    String newTopic;
+    do {
+      newTopic = RandomStringUtils.randomAlphabetic(8);
+      // Set.add returns false when the topic name was already handed out
+    } while (!usedTopics.add(newTopic));
+    return newTopic;
+  }
+
+}
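Review note on the partition tests: `doPartitionErrors` and `doPartitionHeader` cover how a partition is chosen from a static setting and an optional event header. The sink code that does this is not part of this section of the patch, so the sketch below rests on assumptions: the header value, when present, overrides the static ID; a missing header falls back to the static ID or to the producer's partitioner (`null`); and bad values fail the delivery. `PartitionResolutionSketch`, `resolvePartition` and the explicit range check are hypothetical. In the real sink, the tests expect an `EventDeliveryException` rather than the `IllegalArgumentException` used here.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, Flume-free illustration of the scenarios exercised by the partition tests.
final class PartitionResolutionSketch {

  // Returns the partition to use, or null to leave the choice to the partitioner.
  static Integer resolvePartition(Map<String, String> headers, String partitionHeader,
                                  Integer staticPartitionId, int numPartitions) {
    Integer partitionId = staticPartitionId;
    if (partitionHeader != null) {
      String headerVal = headers.get(partitionHeader);
      if (headerVal != null) {
        try {
          // assumption: a header value overrides the static partition ID
          partitionId = Integer.parseInt(headerVal);
        } catch (NumberFormatException ex) {
          throw new IllegalArgumentException("Non-integer partition id: " + headerVal, ex);
        }
      }
    }
    // stand-in for the error a producer would raise for an unknown partition
    if (partitionId != null && (partitionId < 0 || partitionId >= numPartitions)) {
      throw new IllegalArgumentException("Partition " + partitionId + " is out of range");
    }
    return partitionId;
  }

  public static void main(String[] args) {
    Map<String, String> headers = new HashMap<>();
    headers.put("wrong-header", "2");
    // NOTSET scenario: the configured header is absent, so no partition is forced
    System.out.println(resolvePartition(headers, "partition-header", null, 5));
  }
}
```

Returning `null` for the NOTSET case matches the test's expectation that no exception is thrown and the event is still delivered.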
diff --git a/code/flume-ng-sinks/flume-ng-kafka-sink/src/test/java/org/apache/flume/sink/kafka/util/KafkaConsumer.java b/code/flume-ng-sinks/flume-ng-kafka-sink/src/test/java/org/apache/flume/sink/kafka/util/KafkaConsumer.java
new file mode 100644
index 0000000..d5dfbd6
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-kafka-sink/src/test/java/org/apache/flume/sink/kafka/util/KafkaConsumer.java
@@ -0,0 +1,98 @@
+/**
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements.  See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License.  You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+ */
+
+package org.apache.flume.sink.kafka.util;
+
+import kafka.consumer.ConsumerConfig;
+import kafka.consumer.ConsumerIterator;
+import kafka.consumer.ConsumerTimeoutException;
+import kafka.consumer.KafkaStream;
+import kafka.javaapi.consumer.ConsumerConnector;
+import kafka.message.MessageAndMetadata;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Properties;
+
+/**
+ * A Kafka Consumer implementation. This uses the current thread to fetch the
+ * next message from the queue and doesn't use a multi-threaded implementation,
+ * so this implements a synchronous blocking call.
+ * To avoid infinite waiting, a timeout is implemented to wait only for
+ * 1 second (consumer.timeout.ms = 1000) before concluding that the message
+ * will not be available.
+ */
+public class KafkaConsumer {
+
+  private static final Logger logger = LoggerFactory.getLogger(
+      KafkaConsumer.class);
+
+  private final ConsumerConnector consumer;
+  Map<String, List<KafkaStream<byte[], byte[]>>> consumerMap;
+
+  public KafkaConsumer() {
+    consumer = kafka.consumer.Consumer.createJavaConsumerConnector(
+        createConsumerConfig(TestUtil.getInstance().getZkUrl(), "group_1"));
+  }
+
+  private static ConsumerConfig createConsumerConfig(String zkUrl,
+      String groupId) {
+    Properties props = new Properties();
+    props.put("zookeeper.connect", zkUrl);
+    props.put("group.id", groupId);
+    props.put("zookeeper.session.timeout.ms", "1000");
+    props.put("zookeeper.sync.time.ms", "200");
+    props.put("auto.commit.interval.ms", "1000");
+    props.put("auto.offset.reset", "smallest");
+    props.put("consumer.timeout.ms", "1000");
+    return new ConsumerConfig(props);
+  }
+
+  public void initTopicList(List<String> topics) {
+    Map<String, Integer> topicCountMap = new HashMap<String, Integer>();
+    for (String topic : topics) {
+      // we need only single threaded consumers
+      topicCountMap.put(topic, 1);
+    }
+    consumerMap = consumer.createMessageStreams(topicCountMap);
+  }
+
+  public MessageAndMetadata getNextMessage(String topic) {
+    List<KafkaStream<byte[], byte[]>> streams = consumerMap.get(topic);
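+    // there is only a single stream, because initTopicList requests one consumer thread per topic
+    KafkaStream<byte[], byte[]> stream = streams.get(0);
+    final ConsumerIterator<byte[], byte[]> it = stream.iterator();
+    // hasNext() blocks until a message arrives or consumer.timeout.ms elapses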
+    try {
+      if (it.hasNext()) {
+        return it.next();
+      } else {
+        return null;
+      }
+    } catch (ConsumerTimeoutException e) {
+      logger.error("0 messages available to fetch for the topic " + topic);
+      return null;
+    }
+  }
+
+  public void shutdown() {
+    consumer.shutdown();
+  }
+}
diff --git a/code/flume-ng-sinks/flume-ng-kafka-sink/src/test/java/org/apache/flume/sink/kafka/util/KafkaLocal.java b/code/flume-ng-sinks/flume-ng-kafka-sink/src/test/java/org/apache/flume/sink/kafka/util/KafkaLocal.java
new file mode 100644
index 0000000..6d89bd3
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-kafka-sink/src/test/java/org/apache/flume/sink/kafka/util/KafkaLocal.java
@@ -0,0 +1,51 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ * <p>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * limitations under the License.
+ */
+
+package org.apache.flume.sink.kafka.util;
+
+import kafka.server.KafkaConfig;
+import kafka.server.KafkaServerStartable;
+
+import java.io.IOException;
+import java.util.Properties;
+
+/**
+ * A local Kafka server for running unit tests.
+ * Reference: https://gist.github.com/fjavieralba/7930018/
+ */
+public class KafkaLocal {
+
+  public KafkaServerStartable kafka;
+  public ZooKeeperLocal zookeeper;
+
+  public KafkaLocal(Properties kafkaProperties) throws IOException, InterruptedException {
+    KafkaConfig kafkaConfig = KafkaConfig.fromProps(kafkaProperties);
+
+    // start local kafka broker
+    kafka = new KafkaServerStartable(kafkaConfig);
+  }
+
+  public void start() throws Exception {
+    kafka.startup();
+  }
+
+  public void stop() {
+    kafka.shutdown();
+  }
+
+}
\ No newline at end of file
diff --git a/code/flume-ng-sinks/flume-ng-kafka-sink/src/test/java/org/apache/flume/sink/kafka/util/TestUtil.java b/code/flume-ng-sinks/flume-ng-kafka-sink/src/test/java/org/apache/flume/sink/kafka/util/TestUtil.java
new file mode 100644
index 0000000..6405d6c
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-kafka-sink/src/test/java/org/apache/flume/sink/kafka/util/TestUtil.java
@@ -0,0 +1,175 @@
+/**
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements.  See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License.  You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+ limitations under the License.
+ */
+
+package org.apache.flume.sink.kafka.util;
+
+import kafka.message.MessageAndMetadata;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.net.BindException;
+import java.net.InetAddress;
+import java.net.UnknownHostException;
+import java.util.List;
+import java.util.Properties;
+import java.util.Random;
+
+/**
+ * A utility class for starting/stopping Kafka Server.
+ */
+public class TestUtil {
+
+  private static final Logger logger = LoggerFactory.getLogger(TestUtil.class);
+  private static TestUtil instance = new TestUtil();
+
+  private Random randPortGen = new Random(System.currentTimeMillis());
+  private KafkaLocal kafkaServer;
+  private KafkaConsumer kafkaConsumer;
+  private String hostname = "localhost";
+  private int kafkaLocalPort;
+  private int zkLocalPort;
+
+  private TestUtil() {
+    init();
+  }
+
+  public static TestUtil getInstance() {
+    return instance;
+  }
+
+  private void init() {
+    // resolve the local hostname; keep the default 'localhost' if lookup fails.
+    try {
+      hostname = InetAddress.getLocalHost().getHostName();
+    } catch (UnknownHostException e) {
+      logger.warn("Error getting the value of localhost. " +
+          "Proceeding with 'localhost'.", e);
+    }
+  }
+
+  private boolean startKafkaServer() {
+    Properties kafkaProperties = new Properties();
+    Properties zkProperties = new Properties();
+
+    logger.info("Starting kafka server.");
+    try {
+      //load properties
+      zkProperties.load(Class.class.getResourceAsStream(
+          "/zookeeper.properties"));
+
+      ZooKeeperLocal zookeeper;
+      while (true) {
+        //start local Zookeeper
+        try {
+          zkLocalPort = getNextPort();
+          // override the Zookeeper client port with the generated one.
+          zkProperties.setProperty("clientPort", Integer.toString(zkLocalPort));
+          zookeeper = new ZooKeeperLocal(zkProperties);
+          break;
+        } catch (BindException bindEx) {
+          // bind exception. port is already in use. Try a different port.
+        }
+      }
+      logger.info("ZooKeeper instance is successfully started on port " +
+          zkLocalPort);
+
+      kafkaProperties.load(Class.class.getResourceAsStream(
+          "/kafka-server.properties"));
+      // override the Zookeeper url.
+      kafkaProperties.setProperty("zookeeper.connect", getZkUrl());
+      while (true) {
+        kafkaLocalPort = getNextPort();
+        // override the Kafka server port
+        kafkaProperties.setProperty("port", Integer.toString(kafkaLocalPort));
+        kafkaServer = new KafkaLocal(kafkaProperties);
+        try {
+          kafkaServer.start();
+          break;
+        } catch (BindException bindEx) {
+          // let's try another port.
+        }
+      }
+      logger.info("Kafka Server is successfully started on port " +
+          kafkaLocalPort);
+      return true;
+
+    } catch (Exception e) {
+      logger.error("Error starting the Kafka Server.", e);
+      return false;
+    }
+  }
+
+  private KafkaConsumer getKafkaConsumer() {
+    synchronized (this) {
+      if (kafkaConsumer == null) {
+        kafkaConsumer = new KafkaConsumer();
+      }
+    }
+    return kafkaConsumer;
+  }
+
+  public void initTopicList(List<String> topics) {
+    getKafkaConsumer().initTopicList(topics);
+  }
+
+  public MessageAndMetadata getNextMessageFromConsumer(String topic) {
+    return getKafkaConsumer().getNextMessage(topic);
+  }
+
+  public void prepare() {
+    boolean startStatus = startKafkaServer();
+    if (!startStatus) {
+      throw new RuntimeException("Error starting the server!");
+    }
+    try {
+      Thread.sleep(3 * 1000);   // add this sleep time to
+      // ensure that the server is fully started before proceeding with tests.
+    } catch (InterruptedException e) {
+      // ignore
+    }
+    getKafkaConsumer();
+    logger.info("Completed the prepare phase.");
+  }
+
+  public void tearDown() {
+    logger.info("Shutting down the Kafka Consumer.");
+    getKafkaConsumer().shutdown();
+    try {
+      Thread.sleep(3 * 1000);   // add this sleep time to
+      // ensure that the consumer is fully shut down before stopping the server.
+    } catch (InterruptedException e) {
+      // ignore
+    }
+    logger.info("Shutting down the kafka Server.");
+    kafkaServer.stop();
+    logger.info("Completed the tearDown phase.");
+  }
+
+  private synchronized int getNextPort() {
+    // generate a random port number in the range 49152 (inclusive) to 65535 (exclusive)
+    return randPortGen.nextInt(65535 - 49152) + 49152;
+  }
+
+  public String getZkUrl() {
+    return hostname + ":" + zkLocalPort;
+  }
+
+  public String getKafkaServerUrl() {
+    return hostname + ":" + kafkaLocalPort;
+  }
+}
diff --git a/code/flume-ng-sinks/flume-ng-kafka-sink/src/test/java/org/apache/flume/sink/kafka/util/ZooKeeperLocal.java b/code/flume-ng-sinks/flume-ng-kafka-sink/src/test/java/org/apache/flume/sink/kafka/util/ZooKeeperLocal.java
new file mode 100644
index 0000000..35c1e47
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-kafka-sink/src/test/java/org/apache/flume/sink/kafka/util/ZooKeeperLocal.java
@@ -0,0 +1,61 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ * <p>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * limitations under the License.
+ */
+
+package org.apache.flume.sink.kafka.util;
+
+import org.apache.zookeeper.server.ServerConfig;
+import org.apache.zookeeper.server.ZooKeeperServerMain;
+import org.apache.zookeeper.server.quorum.QuorumPeerConfig;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.util.Properties;
+
+/**
+ * A local ZooKeeper server for running unit tests.
+ * Reference: https://gist.github.com/fjavieralba/7930018/
+ */
+public class ZooKeeperLocal {
+
+  private static final Logger logger = LoggerFactory.getLogger(ZooKeeperLocal.class);
+  private ZooKeeperServerMain zooKeeperServer;
+
+  public ZooKeeperLocal(Properties zkProperties) throws IOException {
+    QuorumPeerConfig quorumConfiguration = new QuorumPeerConfig();
+    try {
+      quorumConfiguration.parseProperties(zkProperties);
+    } catch (Exception e) {
+      throw new RuntimeException(e);
+    }
+
+    zooKeeperServer = new ZooKeeperServerMain();
+    final ServerConfig configuration = new ServerConfig();
+    configuration.readFrom(quorumConfiguration);
+
+    new Thread() {
+      public void run() {
+        try {
+          zooKeeperServer.runFromConfig(configuration);
+        } catch (IOException e) {
+          logger.error("Zookeeper startup failed.", e);
+        }
+      }
+    }.start();
+  }
+}
diff --git a/code/flume-ng-sinks/flume-ng-kafka-sink/src/test/resources/kafka-server.properties b/code/flume-ng-sinks/flume-ng-kafka-sink/src/test/resources/kafka-server.properties
new file mode 100644
index 0000000..02a81e2
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-kafka-sink/src/test/resources/kafka-server.properties
@@ -0,0 +1,118 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+# 
+#    http://www.apache.org/licenses/LICENSE-2.0
+# 
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# see kafka.server.KafkaConfig for additional details and defaults
+
+############################# Server Basics #############################
+
+# The id of the broker. This must be set to a unique integer for each broker.
+broker.id=0
+
+############################# Socket Server Settings #############################
+
+# The port the socket server listens on
+port=9092
+
+# Hostname the broker will bind to. If not set, the server will bind to all interfaces
+#host.name=localhost
+
+# Hostname the broker will advertise to producers and consumers. If not set, it uses the
+# value for "host.name" if configured.  Otherwise, it will use the value returned from
+# java.net.InetAddress.getCanonicalHostName().
+#advertised.host.name=<hostname routable by clients>
+
+# The port to publish to ZooKeeper for clients to use. If this is not set,
+# it will publish the same port that the broker binds to.
+#advertised.port=<port accessible by clients>
+
+# The number of threads handling network requests
+num.network.threads=2
+ 
+# The number of threads doing disk I/O
+num.io.threads=8
+
+# The send buffer (SO_SNDBUF) used by the socket server
+socket.send.buffer.bytes=1048576
+
+# The receive buffer (SO_RCVBUF) used by the socket server
+socket.receive.buffer.bytes=1048576
+
+# The maximum size of a request that the socket server will accept (protection against OOM)
+socket.request.max.bytes=104857600
+
+
+############################# Log Basics #############################
+
+# A comma-separated list of directories under which to store log files
+log.dirs=target/kafka-logs
+
+# The default number of log partitions per topic. More partitions allow greater
+# parallelism for consumption, but this will also result in more files across
+# the brokers.
+num.partitions=2
+
+############################# Log Flush Policy #############################
+
+# Messages are immediately written to the filesystem but by default we only fsync() to sync
+# the OS cache lazily. The following configurations control the flush of data to disk. 
+# There are a few important trade-offs here:
+#    1. Durability: Unflushed data may be lost if you are not using replication.
+#    2. Latency: Very large flush intervals may lead to latency spikes when the flush does occur as there will be a lot of data to flush.
+#    3. Throughput: The flush is generally the most expensive operation, and a small flush interval may lead to excessive seeks. 
+# The settings below allow one to configure the flush policy to flush data after a period of time or
+# every N messages (or both). This can be done globally and overridden on a per-topic basis.
+
+# The number of messages to accept before forcing a flush of data to disk
+#log.flush.interval.messages=10000
+
+# The maximum amount of time a message can sit in a log before we force a flush
+#log.flush.interval.ms=1000
+
+############################# Log Retention Policy #############################
+
+# The following configurations control the disposal of log segments. The policy can
+# be set to delete segments after a period of time, or after a given size has accumulated.
+# A segment will be deleted whenever *either* of these criteria are met. Deletion always happens
+# from the end of the log.
+
+# The minimum age of a log file to be eligible for deletion
+log.retention.hours=168
+
+# A size-based retention policy for logs. Segments are pruned from the log as long as the remaining
+# segments don't drop below log.retention.bytes.
+#log.retention.bytes=1073741824
+
+# The maximum size of a log segment file. When this size is reached a new log segment will be created.
+log.segment.bytes=536870912
+
+# The interval at which log segments are checked to see if they can be deleted according 
+# to the retention policies
+log.retention.check.interval.ms=60000
+
+# By default the log cleaner is disabled and the log retention policy will default to just delete segments after their retention expires.
+# If log.cleaner.enable=true is set the cleaner will be enabled and individual logs can then be marked for log compaction.
+log.cleaner.enable=false
+
+############################# Zookeeper #############################
+
+# Zookeeper connection string (see zookeeper docs for details).
+# This is a comma-separated list of host:port pairs, each corresponding to a zk
+# server. e.g. "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002".
+# You can also append an optional chroot string to the urls to specify the
+# root directory for all kafka znodes.
+zookeeper.connect=localhost:2181
+
+# Timeout in ms for connecting to zookeeper
+zookeeper.connection.timeout.ms=1000000
diff --git a/code/flume-ng-sinks/flume-ng-kafka-sink/src/test/resources/log4j.properties b/code/flume-ng-sinks/flume-ng-kafka-sink/src/test/resources/log4j.properties
new file mode 100644
index 0000000..b86600b
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-kafka-sink/src/test/resources/log4j.properties
@@ -0,0 +1,78 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+kafka.logs.dir=target/logs
+
+log4j.rootLogger=INFO, stdout
+
+log4j.appender.stdout=org.apache.log4j.ConsoleAppender
+log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
+log4j.appender.stdout.layout.ConversionPattern=[%d] %p %m (%c)%n
+
+log4j.appender.kafkaAppender=org.apache.log4j.DailyRollingFileAppender
+log4j.appender.kafkaAppender.DatePattern='.'yyyy-MM-dd-HH
+log4j.appender.kafkaAppender.File=${kafka.logs.dir}/server.log
+log4j.appender.kafkaAppender.layout=org.apache.log4j.PatternLayout
+log4j.appender.kafkaAppender.layout.ConversionPattern=[%d] %p %m (%c)%n
+
+log4j.appender.stateChangeAppender=org.apache.log4j.DailyRollingFileAppender
+log4j.appender.stateChangeAppender.DatePattern='.'yyyy-MM-dd-HH
+log4j.appender.stateChangeAppender.File=${kafka.logs.dir}/state-change.log
+log4j.appender.stateChangeAppender.layout=org.apache.log4j.PatternLayout
+log4j.appender.stateChangeAppender.layout.ConversionPattern=[%d] %p %m (%c)%n
+
+log4j.appender.requestAppender=org.apache.log4j.DailyRollingFileAppender
+log4j.appender.requestAppender.DatePattern='.'yyyy-MM-dd-HH
+log4j.appender.requestAppender.File=${kafka.logs.dir}/kafka-request.log
+log4j.appender.requestAppender.layout=org.apache.log4j.PatternLayout
+log4j.appender.requestAppender.layout.ConversionPattern=[%d] %p %m (%c)%n
+
+log4j.appender.cleanerAppender=org.apache.log4j.DailyRollingFileAppender
+log4j.appender.cleanerAppender.DatePattern='.'yyyy-MM-dd-HH
+log4j.appender.cleanerAppender.File=${kafka.logs.dir}/log-cleaner.log
+log4j.appender.cleanerAppender.layout=org.apache.log4j.PatternLayout
+log4j.appender.cleanerAppender.layout.ConversionPattern=[%d] %p %m (%c)%n
+
+log4j.appender.controllerAppender=org.apache.log4j.DailyRollingFileAppender
+log4j.appender.controllerAppender.DatePattern='.'yyyy-MM-dd-HH
+log4j.appender.controllerAppender.File=${kafka.logs.dir}/controller.log
+log4j.appender.controllerAppender.layout=org.apache.log4j.PatternLayout
+log4j.appender.controllerAppender.layout.ConversionPattern=[%d] %p %m (%c)%n
+
+# Turn on all our debugging info
+#log4j.logger.kafka.producer.async.DefaultEventHandler=DEBUG, kafkaAppender
+#log4j.logger.kafka.client.ClientUtils=DEBUG, kafkaAppender
+#log4j.logger.kafka.perf=DEBUG, kafkaAppender
+#log4j.logger.kafka.perf.ProducerPerformance$ProducerThread=DEBUG, kafkaAppender
+#log4j.logger.org.I0Itec.zkclient.ZkClient=DEBUG
+log4j.logger.kafka=INFO, kafkaAppender
+
+log4j.logger.kafka.network.RequestChannel$=WARN, requestAppender
+log4j.additivity.kafka.network.RequestChannel$=false
+
+#log4j.logger.kafka.network.Processor=TRACE, requestAppender
+#log4j.logger.kafka.server.KafkaApis=TRACE, requestAppender
+#log4j.additivity.kafka.server.KafkaApis=false
+log4j.logger.kafka.request.logger=WARN, requestAppender
+log4j.additivity.kafka.request.logger=false
+
+log4j.logger.kafka.controller=TRACE, controllerAppender
+log4j.additivity.kafka.controller=false
+
+log4j.logger.kafka.log.LogCleaner=INFO, cleanerAppender
+log4j.additivity.kafka.log.LogCleaner=false
+
+log4j.logger.state.change.logger=TRACE, stateChangeAppender
+log4j.additivity.state.change.logger=false
diff --git a/code/flume-ng-sinks/flume-ng-kafka-sink/src/test/resources/zookeeper.properties b/code/flume-ng-sinks/flume-ng-kafka-sink/src/test/resources/zookeeper.properties
new file mode 100644
index 0000000..89e1b5e
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-kafka-sink/src/test/resources/zookeeper.properties
@@ -0,0 +1,20 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# the directory where the snapshot is stored.
+dataDir=target
+# the port at which the clients will connect
+clientPort=2181
+# disable the per-ip limit on the number of connections since this is a non-production config
+maxClientCnxns=0
\ No newline at end of file
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/README.md b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/README.md
new file mode 100644
index 0000000..ede3ab7
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/README.md
@@ -0,0 +1,20 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#  http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# Flume Morphline Solr Sink
+
+This module contains a Flume Morphline Solr Sink that extracts search documents from Flume events, transforms them and loads them in Near Real Time into Apache Solr, typically a SolrCloud. This sink is intended to be used alongside the HdfsSink. It is designed to process not just structured data, but also arbitrary raw data, including data from many heterogeneous data sources.
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/pom.xml b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/pom.xml
new file mode 100644
index 0000000..055c2c2
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/pom.xml
@@ -0,0 +1,139 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to You under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+     http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+  <modelVersion>4.0.0</modelVersion>
+
+  <parent>
+    <artifactId>flume-ng-sinks</artifactId>
+    <groupId>org.apache.flume</groupId>
+    <version>1.7.0</version>
+  </parent>
+
+  <groupId>org.apache.flume.flume-ng-sinks</groupId>
+  <artifactId>flume-ng-morphline-solr-sink</artifactId>
+  <version>1.7.0</version>
+  <name>Flume NG Morphline Solr Sink</name>
+
+  <properties>
+    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
+    <solr.version>4.3.0</solr.version>
+    <solr.expected.version>4.3.0</solr.expected.version> <!-- sanity check to verify we actually run against the expected version rather than some outdated version -->
+    <slf4j.version>1.6.1</slf4j.version>
+    <surefire.version>2.12.4</surefire.version>
+  </properties>
+
+  <dependencies>
+
+    <dependency>
+      <groupId>org.apache.flume</groupId>
+      <artifactId>flume-ng-core</artifactId>
+    </dependency>
+
+    <dependency>
+      <groupId>org.slf4j</groupId>
+      <artifactId>slf4j-api</artifactId>
+    </dependency>
+
+    <dependency>
+      <groupId>org.kitesdk</groupId>
+      <artifactId>kite-morphlines-all</artifactId>
+      <version>${kite.version}</version>
+      <exclusions>
+        <exclusion>
+          <groupId>org.apache.hadoop</groupId>
+          <artifactId>hadoop-common</artifactId>
+        </exclusion>
+      </exclusions>
+      <type>pom</type>
+      <optional>true</optional>
+    </dependency>
+
+    <dependency>
+      <groupId>org.slf4j</groupId>
+      <artifactId>jcl-over-slf4j</artifactId>
+      <version>${slf4j.version}</version> <!-- flume provides 1.7.2 and solr depends on 1.6.4 -->
+      <scope>provided</scope>
+    </dependency>
+
+    <dependency> <!-- see http://lucene.apache.org/solr -->
+      <groupId>org.apache.solr</groupId>
+      <artifactId>solr-test-framework</artifactId>
+      <version>${solr.version}</version>
+      <scope>test</scope>
+      <exclusions>
+        <exclusion>
+          <groupId>org.slf4j</groupId>
+          <artifactId>slf4j-jdk14</artifactId> <!-- instead use slf4j on top of log4j or logback  -->
+        </exclusion>
+      </exclusions>
+    </dependency>
+
+    <dependency>
+      <groupId>org.kitesdk</groupId>
+      <artifactId>kite-morphlines-solr-core</artifactId>
+      <version>${kite.version}</version>
+      <type>test-jar</type>
+      <scope>test</scope>
+    </dependency>
+
+    <dependency>
+      <groupId>junit</groupId>
+      <artifactId>junit</artifactId>
+      <scope>test</scope>
+    </dependency>
+
+  </dependencies>
+
+  <build>
+    <plugins>
+
+      <plugin>
+        <groupId>org.apache.maven.plugins</groupId>
+        <artifactId>maven-surefire-plugin</artifactId>
+        <version>${surefire.version}</version>
+        <configuration>
+          <argLine>-Dtests.locale=en_us</argLine>
+          <redirectTestOutputToFile>true</redirectTestOutputToFile>
+          <systemPropertyVariables>
+            <!--<solr.expected.version>${solr.expected.version}</solr.expected.version>-->
+          </systemPropertyVariables>
+        </configuration>
+      </plugin>
+
+      <plugin>
+        <groupId>org.apache.rat</groupId>
+        <artifactId>apache-rat-plugin</artifactId>
+        <executions>
+          <execution>
+            <id>test.rat</id>
+            <phase>test</phase>
+            <goals>
+              <goal>check</goal>
+            </goals>
+            <configuration>
+              <excludes>
+                <exclude>src/test/resources/**</exclude>
+              </excludes>
+            </configuration>
+          </execution>
+        </executions>
+      </plugin>
+
+    </plugins>
+  </build>
+</project>
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/BlobDeserializer.java b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/BlobDeserializer.java
new file mode 100644
index 0000000..095f889
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/BlobDeserializer.java
@@ -0,0 +1,162 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flume.sink.solr.morphline;
+
+import java.io.IOException;
+import java.util.List;
+
+import org.apache.commons.io.output.ByteArrayOutputStream;
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.flume.annotations.InterfaceAudience;
+import org.apache.flume.annotations.InterfaceStability;
+import org.apache.flume.conf.ConfigurationException;
+import org.apache.flume.event.EventBuilder;
+import org.apache.flume.serialization.EventDeserializer;
+import org.apache.flume.serialization.ResettableInputStream;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import com.google.common.collect.Lists;
+
+/**
+ * A deserializer that reads a Binary Large Object (BLOB) per event, typically
+ * one BLOB per file. To be used in conjunction with Flume SpoolDirectorySource.
+ * <p>
+ * Note that this approach is not suitable for very large objects because it
+ * buffers up the entire BLOB.
+ */
+@InterfaceAudience.Private
+@InterfaceStability.Evolving
+public class BlobDeserializer implements EventDeserializer {
+
+  private ResettableInputStream in;
+  private final int maxBlobLength;
+  private volatile boolean isOpen;
+
+  public static final String MAX_BLOB_LENGTH_KEY = "maxBlobLength";
+  public static final int MAX_BLOB_LENGTH_DEFAULT = 100 * 1000 * 1000;
+
+  private static final int DEFAULT_BUFFER_SIZE = 1024 * 8;
+  private static final Logger LOGGER = LoggerFactory.getLogger(BlobDeserializer.class);
+      
+  protected BlobDeserializer(Context context, ResettableInputStream in) {
+    this.in = in;
+    this.maxBlobLength = context.getInteger(MAX_BLOB_LENGTH_KEY, MAX_BLOB_LENGTH_DEFAULT);
+    if (this.maxBlobLength <= 0) {
+      throw new ConfigurationException("Configuration parameter " + MAX_BLOB_LENGTH_KEY
+          + " must be greater than zero: " + maxBlobLength);
+    }
+    this.isOpen = true;
+  }
+
+  /**
+   * Reads a BLOB from a file and returns an event
+   * @return Event containing a BLOB
+   * @throws IOException
+   */
+  @SuppressWarnings("resource")
+  @Override
+  public Event readEvent() throws IOException {
+    ensureOpen();
+    ByteArrayOutputStream blob = null;
+    byte[] buf = new byte[Math.min(maxBlobLength, DEFAULT_BUFFER_SIZE)];
+    int blobLength = 0;
+    int n = 0;
+    while ((n = in.read(buf, 0, Math.min(buf.length, maxBlobLength - blobLength))) != -1) {
+      if (blob == null) {
+        blob = new ByteArrayOutputStream(n);
+      }
+      blob.write(buf, 0, n);
+      blobLength += n;
+      if (blobLength >= maxBlobLength) {
+        LOGGER.warn("File length exceeds maxBlobLength ({}), truncating BLOB event!",
+                    maxBlobLength);
+        break;
+      }
+    }
+    
+    if (blob == null) {
+      return null;
+    } else {
+      return EventBuilder.withBody(blob.toByteArray());
+    }
+  }
+  
+  /**
+   * Batch BLOB read
+   * @param numEvents Maximum number of events to return.
+   * @return List of events containing read BLOBs
+   * @throws IOException
+   */
+  @Override
+  public List<Event> readEvents(int numEvents) throws IOException {
+    ensureOpen();
+    List<Event> events = Lists.newLinkedList();
+    for (int i = 0; i < numEvents; i++) {
+      Event event = readEvent();
+      if (event != null) {
+        events.add(event);
+      } else {
+        break;
+      }
+    }
+    return events;
+  }
+
+  @Override
+  public void mark() throws IOException {
+    ensureOpen();
+    in.mark();
+  }
+
+  @Override
+  public void reset() throws IOException {
+    ensureOpen();
+    in.reset();
+  }
+
+  @Override
+  public void close() throws IOException {
+    if (isOpen) {
+      reset();
+      in.close();
+      isOpen = false;
+    }
+  }
+
+  private void ensureOpen() {
+    if (!isOpen) {
+      throw new IllegalStateException("Deserializer has been closed");
+    }
+  }
+
+  
+  ///////////////////////////////////////////////////////////////////////////////
+  // Nested classes:
+  ///////////////////////////////////////////////////////////////////////////////
+  /** Builder implementations MUST have a public no-arg constructor */
+  public static class Builder implements EventDeserializer.Builder {
+
+    @Override
+    public BlobDeserializer build(Context context, ResettableInputStream in) {      
+      return new BlobDeserializer(context, in);
+    }
+
+  }
+
+}
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/BlobHandler.java b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/BlobHandler.java
new file mode 100644
index 0000000..fe98746
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/BlobHandler.java
@@ -0,0 +1,129 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flume.sink.solr.morphline;
+
+import java.io.InputStream;
+import java.util.Collections;
+import java.util.Enumeration;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+import javax.servlet.http.HttpServletRequest;
+
+import org.apache.commons.io.output.ByteArrayOutputStream;
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.flume.conf.ConfigurationException;
+import org.apache.flume.conf.LogPrivacyUtil;
+import org.apache.flume.event.EventBuilder;
+import org.apache.flume.source.http.HTTPSourceHandler;
+import org.apache.tika.metadata.Metadata;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * BlobHandler for HTTPSource that returns an event that contains the request
+ * parameters as well as the Binary Large Object (BLOB) uploaded with this
+ * request.
+ * <p>
+ * Note that this approach is not suitable for very large objects because it
+ * buffers up the entire BLOB.
+ * <p>
+ * Example client usage:
+ * <pre>
+ * curl --data-binary @sample-statuses-20120906-141433-medium.avro 'http://127.0.0.1:5140?resourceName=sample-statuses-20120906-141433-medium.avro' --header 'Content-Type:application/octet-stream' --verbose
+ * </pre>
+ */
+public class BlobHandler implements HTTPSourceHandler {
+
+  private int maxBlobLength = MAX_BLOB_LENGTH_DEFAULT;
+
+  public static final String MAX_BLOB_LENGTH_KEY = "maxBlobLength";
+  public static final int MAX_BLOB_LENGTH_DEFAULT = 100 * 1000 * 1000;
+
+  private static final int DEFAULT_BUFFER_SIZE = 1024 * 8;
+  private static final Logger LOGGER = LoggerFactory.getLogger(BlobHandler.class);
+
+  public BlobHandler() {
+  }
+
+  @Override
+  public void configure(Context context) {
+    this.maxBlobLength = context.getInteger(MAX_BLOB_LENGTH_KEY, MAX_BLOB_LENGTH_DEFAULT);
+    if (this.maxBlobLength <= 0) {
+      throw new ConfigurationException("Configuration parameter " + MAX_BLOB_LENGTH_KEY
+          + " must be greater than zero: " + maxBlobLength);
+    }
+  }
+
+  @SuppressWarnings("resource")
+  @Override
+  public List<Event> getEvents(HttpServletRequest request) throws Exception {
+    Map<String, String> headers = getHeaders(request);
+    InputStream in = request.getInputStream();
+    try {
+      ByteArrayOutputStream blob = null;
+      byte[] buf = new byte[Math.min(maxBlobLength, DEFAULT_BUFFER_SIZE)];
+      int blobLength = 0;
+      int n = 0;
+      while ((n = in.read(buf, 0, Math.min(buf.length, maxBlobLength - blobLength))) != -1) {
+        if (blob == null) {
+          blob = new ByteArrayOutputStream(n);
+        }
+        blob.write(buf, 0, n);
+        blobLength += n;
+        if (blobLength >= maxBlobLength) {
+          LOGGER.warn("Request length exceeds maxBlobLength ({}), truncating BLOB event!",
+              maxBlobLength);
+          break;
+        }
+      }
+
+      byte[] array = blob != null ? blob.toByteArray() : new byte[0];
+      Event event = EventBuilder.withBody(array, headers);
+      if (LOGGER.isDebugEnabled() && LogPrivacyUtil.allowLogRawData()) {
+        LOGGER.debug("blobEvent: {}", event);
+      }
+      return Collections.singletonList(event);
+    } finally {
+      in.close();
+    }
+  }
+
+  private Map<String, String> getHeaders(HttpServletRequest request) {
+    if (LOGGER.isDebugEnabled() && LogPrivacyUtil.allowLogRawData()) {
+      Map requestHeaders = new HashMap();
+      Enumeration iter = request.getHeaderNames();
+      while (iter.hasMoreElements()) {
+        String name = (String) iter.nextElement();
+        requestHeaders.put(name, request.getHeader(name));
+      }
+      LOGGER.debug("requestHeaders: {}", requestHeaders);
+    }
+    Map<String, String> headers = new HashMap();
+    if (request.getContentType() != null) {
+      headers.put(Metadata.CONTENT_TYPE, request.getContentType());
+    }
+    Enumeration iter = request.getParameterNames();
+    while (iter.hasMoreElements()) {
+      String name = (String) iter.nextElement();
+      headers.put(name, request.getParameter(name));
+    }
+    return headers;
+  }
+}
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/MorphlineHandler.java b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/MorphlineHandler.java
new file mode 100644
index 0000000..bb5191d
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/MorphlineHandler.java
@@ -0,0 +1,62 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flume.sink.solr.morphline;
+
+import java.io.IOException;
+
+import org.apache.flume.Event;
+import org.apache.flume.conf.Configurable;
+
+/**
+ * Interface to load Flume events into Solr
+ */
+public interface MorphlineHandler extends Configurable {
+
+  /** Begins a transaction */
+  public void beginTransaction();
+
+  /** Loads the given event into Solr */
+  public void process(Event event);
+
+  /**
+   * Sends any outstanding documents to Solr and waits for a positive
+   * or negative ack (i.e. exception). Depending on the outcome the caller
+   * should then commit or rollback the current flume transaction
+   * correspondingly.
+   * 
+   * @throws IOException
+   *           If there is a low-level I/O error.
+   */
+  public void commitTransaction();
+
+  /**
+   * Performs a rollback of all non-committed documents pending.
+   * <p>
+   * Note that this is not a true rollback as in databases. Content you have previously added to
+   * Solr may have already been committed due to autoCommit, buffer full, other client performing a
+   * commit etc. So this is only a best-effort rollback.
+   * 
+   * @throws IOException
+   *           If there is a low-level I/O error.
+   */
+  public void rollbackTransaction();
+
+  /** Releases allocated resources */
+  public void stop();
+
+
+}
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/MorphlineHandlerImpl.java b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/MorphlineHandlerImpl.java
new file mode 100644
index 0000000..d877814
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/MorphlineHandlerImpl.java
@@ -0,0 +1,163 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flume.sink.solr.morphline;
+
+import java.io.File;
+import java.util.Map.Entry;
+
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.kitesdk.morphline.api.Command;
+import org.kitesdk.morphline.api.MorphlineCompilationException;
+import org.kitesdk.morphline.api.MorphlineContext;
+import org.kitesdk.morphline.api.Record;
+import org.kitesdk.morphline.base.Compiler;
+import org.kitesdk.morphline.base.FaultTolerance;
+import org.kitesdk.morphline.base.Fields;
+import org.kitesdk.morphline.base.Metrics;
+import org.kitesdk.morphline.base.Notifications;
+import com.codahale.metrics.Meter;
+import com.codahale.metrics.MetricRegistry;
+import com.codahale.metrics.SharedMetricRegistries;
+import com.codahale.metrics.Timer;
+import com.typesafe.config.Config;
+import com.typesafe.config.ConfigFactory;
+
+/**
+ * A {@link MorphlineHandler} that processes its events using a morphline {@link Command} chain.
+ */
+public class MorphlineHandlerImpl implements MorphlineHandler {
+
+  private MorphlineContext morphlineContext;
+  private Command morphline;
+  private Command finalChild;
+  private String morphlineFileAndId;
+  
+  private Timer mappingTimer;
+  private Meter numRecords;
+  private Meter numFailedRecords;
+  private Meter numExceptionRecords;
+  
+  public static final String MORPHLINE_FILE_PARAM = "morphlineFile";
+  public static final String MORPHLINE_ID_PARAM = "morphlineId";
+  
+  /**
+   * Morphline variables can be passed from flume.conf to the morphline, e.g.:
+   * agent.sinks.solrSink.morphlineVariable.zkHost=127.0.0.1:2181/solr
+   */
+  public static final String MORPHLINE_VARIABLE_PARAM = "morphlineVariable";
+
+  private static final Logger LOG = LoggerFactory.getLogger(MorphlineHandlerImpl.class);
+  
+  // For test injection
+  void setMorphlineContext(MorphlineContext morphlineContext) {
+    this.morphlineContext = morphlineContext;
+  }
+
+  // for interceptor
+  void setFinalChild(Command finalChild) {
+    this.finalChild = finalChild;
+  }
+
+  @Override
+  public void configure(Context context) {
+    String morphlineFile = context.getString(MORPHLINE_FILE_PARAM);
+    String morphlineId = context.getString(MORPHLINE_ID_PARAM);
+    if (morphlineFile == null || morphlineFile.trim().length() == 0) {
+      throw new MorphlineCompilationException("Missing parameter: " + MORPHLINE_FILE_PARAM, null);
+    }
+    morphlineFileAndId = morphlineFile + "@" + morphlineId;
+    
+    if (morphlineContext == null) {
+      FaultTolerance faultTolerance = new FaultTolerance(
+          context.getBoolean(FaultTolerance.IS_PRODUCTION_MODE, false), 
+          context.getBoolean(FaultTolerance.IS_IGNORING_RECOVERABLE_EXCEPTIONS, false),
+          context.getString(FaultTolerance.RECOVERABLE_EXCEPTION_CLASSES));
+      
+      morphlineContext = new MorphlineContext.Builder()
+        .setExceptionHandler(faultTolerance)
+        .setMetricRegistry(SharedMetricRegistries.getOrCreate(morphlineFileAndId))
+        .build();
+    }
+    
+    Config override = ConfigFactory.parseMap(
+        context.getSubProperties(MORPHLINE_VARIABLE_PARAM + "."));
+    morphline = new Compiler().compile(
+        new File(morphlineFile), morphlineId, morphlineContext, finalChild, override);
+    
+    this.mappingTimer = morphlineContext.getMetricRegistry().timer(
+        MetricRegistry.name("morphline.app", Metrics.ELAPSED_TIME));
+    this.numRecords = morphlineContext.getMetricRegistry().meter(
+        MetricRegistry.name("morphline.app", Metrics.NUM_RECORDS));
+    this.numFailedRecords = morphlineContext.getMetricRegistry().meter(
+        MetricRegistry.name("morphline.app", "numFailedRecords"));
+    this.numExceptionRecords = morphlineContext.getMetricRegistry().meter(
+        MetricRegistry.name("morphline.app", "numExceptionRecords"));
+  }
+
+  @Override
+  public void process(Event event) {
+    numRecords.mark();
+    Timer.Context timerContext = mappingTimer.time();
+    try {
+      Record record = new Record();
+      for (Entry<String, String> entry : event.getHeaders().entrySet()) {
+        record.put(entry.getKey(), entry.getValue());
+      }
+      byte[] bytes = event.getBody();
+      if (bytes != null && bytes.length > 0) {
+        record.put(Fields.ATTACHMENT_BODY, bytes);
+      }    
+      try {
+        Notifications.notifyStartSession(morphline);
+        if (!morphline.process(record)) {
+          numFailedRecords.mark();
+          LOG.warn("Morphline {} failed to process record: {}", morphlineFileAndId, record);
+        }
+      } catch (RuntimeException t) {
+        numExceptionRecords.mark();
+        morphlineContext.getExceptionHandler().handleException(t, record);
+      }
+    } finally {
+      timerContext.stop();
+    }
+  }
+
+  @Override
+  public void beginTransaction() {
+    Notifications.notifyBeginTransaction(morphline);      
+  }
+
+  @Override
+  public void commitTransaction() {
+    Notifications.notifyCommitTransaction(morphline);      
+  }
+
+  @Override
+  public void rollbackTransaction() {
+    Notifications.notifyRollbackTransaction(morphline);            
+  }
+
+  @Override
+  public void stop() {
+    Notifications.notifyShutdown(morphline);
+  }
+
+}
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/MorphlineInterceptor.java b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/MorphlineInterceptor.java
new file mode 100644
index 0000000..3b94133
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/MorphlineInterceptor.java
@@ -0,0 +1,242 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flume.sink.solr.morphline;
+
+import java.io.IOException;
+import java.io.InputStream;
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Queue;
+import java.util.concurrent.ConcurrentLinkedQueue;
+
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.flume.FlumeException;
+import org.apache.flume.event.EventBuilder;
+import org.apache.flume.interceptor.Interceptor;
+import org.kitesdk.morphline.api.Command;
+import org.kitesdk.morphline.api.Record;
+import org.kitesdk.morphline.base.Fields;
+
+import com.google.common.base.Preconditions;
+import com.google.common.io.ByteStreams;
+
+/**
+ * Flume Interceptor that executes a morphline on events that are intercepted.
+ * 
+ * Currently, there is a restriction in that the morphline must not generate more than one output
+ * record for each input event.
+ */
+public class MorphlineInterceptor implements Interceptor {
+
+  private final Context context;
+  private final Queue<LocalMorphlineInterceptor> pool = new ConcurrentLinkedQueue<>();
+  
+  protected MorphlineInterceptor(Context context) {
+    Preconditions.checkNotNull(context);
+    this.context = context;
+    // fail fast on morphline compilation exception
+    returnToPool(new LocalMorphlineInterceptor(context));
+  }
+
+  @Override
+  public void initialize() {
+  }
+
+  @Override
+  public void close() {
+    LocalMorphlineInterceptor interceptor;
+    while ((interceptor = pool.poll()) != null) {
+      interceptor.close();
+    }
+  }
+
+  @Override
+  public List<Event> intercept(List<Event> events) {
+    LocalMorphlineInterceptor interceptor = borrowFromPool();
+    List<Event> results = interceptor.intercept(events);
+    returnToPool(interceptor);
+    return results;
+  }
+  
+  @Override
+  public Event intercept(Event event) {
+    LocalMorphlineInterceptor interceptor = borrowFromPool();
+    Event result = interceptor.intercept(event);
+    returnToPool(interceptor);
+    return result;
+  }
+
+  private void returnToPool(LocalMorphlineInterceptor interceptor) {
+    pool.add(interceptor);
+  }
+  
+  private LocalMorphlineInterceptor borrowFromPool() {
+    LocalMorphlineInterceptor interceptor = pool.poll();
+    if (interceptor == null) {
+      interceptor = new LocalMorphlineInterceptor(context);
+    }
+    return interceptor;
+  }
+
+  
+  ///////////////////////////////////////////////////////////////////////////////
+  // Nested classes:
+  ///////////////////////////////////////////////////////////////////////////////
+  /** Builder implementations MUST have a public no-arg constructor */
+  public static class Builder implements Interceptor.Builder {
+
+    private Context context;
+
+    public Builder() {
+    }
+
+    @Override
+    public MorphlineInterceptor build() {
+      return new MorphlineInterceptor(context);
+    }
+
+    @Override
+    public void configure(Context context) {
+      this.context = context;
+    }
+
+  }
+
+  
+  ///////////////////////////////////////////////////////////////////////////////
+  // Nested classes:
+  ///////////////////////////////////////////////////////////////////////////////
+  private static final class LocalMorphlineInterceptor implements Interceptor {
+
+    private final MorphlineHandlerImpl morphline;
+    private final Collector collector;
+    
+    protected LocalMorphlineInterceptor(Context context) {
+      this.morphline = new MorphlineHandlerImpl();
+      this.collector = new Collector();
+      this.morphline.setFinalChild(collector);
+      this.morphline.configure(context);
+    }
+
+    @Override
+    public void initialize() {
+    }
+
+    @Override
+    public void close() {
+      morphline.stop();
+    }
+
+    @Override
+    public List<Event> intercept(List<Event> events) {
+      List results = new ArrayList(events.size());
+      for (Event event : events) {
+        event = intercept(event);
+        if (event != null) {
+          results.add(event);
+        }
+      }
+      return results;
+    }
+
+    @Override
+    public Event intercept(Event event) {
+      collector.reset();
+      morphline.process(event);
+      List<Record> results = collector.getRecords();
+      if (results.size() == 0) {
+        return null;
+      }
+      if (results.size() > 1) {
+        throw new FlumeException(getClass().getName() + 
+            " must not generate more than one output record per input event");
+      }
+      Event result = toEvent(results.get(0));    
+      return result;
+    }
+    
+    private Event toEvent(Record record) {
+      Map<String, String> headers = new HashMap();
+      Map<String, Collection<Object>> recordMap = record.getFields().asMap();
+      byte[] body = null;
+      for (Map.Entry<String, Collection<Object>> entry : recordMap.entrySet()) {
+        if (entry.getValue().size() > 1) {
+          throw new FlumeException(getClass().getName()
+              + " must not generate more than one output value per record field");
+        }
+        assert entry.getValue().size() != 0; // guava guarantees that
+        Object firstValue = entry.getValue().iterator().next();
+        if (Fields.ATTACHMENT_BODY.equals(entry.getKey())) {
+          if (firstValue instanceof byte[]) {
+            body = (byte[]) firstValue;
+          } else if (firstValue instanceof InputStream) {
+            try {
+              body = ByteStreams.toByteArray((InputStream) firstValue);
+            } catch (IOException e) {
+              throw new FlumeException(e);
+            }            
+          } else {
+            throw new FlumeException(getClass().getName()
+                + " must not generate attachments that are not a byte[] or InputStream");
+          }
+        } else {
+          headers.put(entry.getKey(), firstValue.toString());
+        }
+      }
+      return EventBuilder.withBody(body, headers);
+    }
+  }
+  
+  
+  ///////////////////////////////////////////////////////////////////////////////
+  // Nested classes:
+  ///////////////////////////////////////////////////////////////////////////////
+  private static final class Collector implements Command {
+    
+    private final List<Record> results = new ArrayList();
+    
+    public List<Record> getRecords() {
+      return results;
+    }
+    
+    public void reset() {
+      results.clear();
+    }
+
+    @Override
+    public Command getParent() {
+      return null;
+    }
+    
+    @Override
+    public void notify(Record notification) {
+    }
+
+    @Override
+    public boolean process(Record record) {
+      Preconditions.checkNotNull(record);
+      results.add(record);
+      return true;
+    }
+    
+  }
+
+}
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/MorphlineSink.java b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/MorphlineSink.java
new file mode 100644
index 0000000..0917d39
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/MorphlineSink.java
@@ -0,0 +1,204 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flume.sink.solr.morphline;
+
+import org.apache.flume.Channel;
+import org.apache.flume.ChannelException;
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.flume.EventDeliveryException;
+import org.apache.flume.Transaction;
+import org.apache.flume.conf.Configurable;
+import org.apache.flume.conf.ConfigurationException;
+import org.apache.flume.conf.LogPrivacyUtil;
+import org.apache.flume.instrumentation.SinkCounter;
+import org.apache.flume.sink.AbstractSink;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.kitesdk.morphline.api.Command;
+
+/**
+ * Flume sink that extracts search documents from Flume events and processes them using a morphline
+ * {@link Command} chain.
+ */
+public class MorphlineSink extends AbstractSink implements Configurable {
+
+  private int maxBatchSize = 1000;
+  private long maxBatchDurationMillis = 1000;
+  private String handlerClass;
+  private MorphlineHandler handler;
+  private Context context;
+  private SinkCounter sinkCounter;
+
+  public static final String BATCH_SIZE = "batchSize";
+  public static final String BATCH_DURATION_MILLIS = "batchDurationMillis";
+  public static final String HANDLER_CLASS = "handlerClass";
+  
+  private static final Logger LOGGER = LoggerFactory.getLogger(MorphlineSink.class);
+
+  public MorphlineSink() {
+    this(null);
+  }
+
+  /** For testing only */
+  protected MorphlineSink(MorphlineHandler handler) {
+    this.handler = handler;
+  }
+
+  @Override
+  public void configure(Context context) {
+    this.context = context;
+    maxBatchSize = context.getInteger(BATCH_SIZE, maxBatchSize);
+    maxBatchDurationMillis = context.getLong(BATCH_DURATION_MILLIS, maxBatchDurationMillis);
+    handlerClass = context.getString(HANDLER_CLASS, MorphlineHandlerImpl.class.getName());    
+    if (sinkCounter == null) {
+      sinkCounter = new SinkCounter(getName());
+    }
+  }
+
+  /**
+   * Returns the maximum number of events to take per Flume transaction,
+   * as configured via the batchSize parameter.
+   */
+  private int getMaxBatchSize() {
+    return maxBatchSize;
+  }
+
+  /** Returns the maximum duration per Flume transaction, in milliseconds. */
+  private long getMaxBatchDurationMillis() {
+    return maxBatchDurationMillis;
+  }
+
+  @Override
+  public synchronized void start() {
+    LOGGER.info("Starting Morphline Sink {} ...", this);
+    sinkCounter.start();
+    if (handler == null) {
+      MorphlineHandler tmpHandler;
+      try {
+        tmpHandler = (MorphlineHandler) Class.forName(handlerClass).newInstance();
+      } catch (Exception e) {
+        throw new ConfigurationException(e);
+      }
+      tmpHandler.configure(context);
+      handler = tmpHandler;
+    }    
+    super.start();
+    LOGGER.info("Morphline Sink {} started.", getName());
+  }
+
+  @Override
+  public synchronized void stop() {
+    LOGGER.info("Morphline Sink {} stopping...", getName());
+    try {
+      if (handler != null) {
+        handler.stop();
+      }
+      sinkCounter.stop();
+      LOGGER.info("Morphline Sink {} stopped. Metrics: {}", getName(), sinkCounter);
+    } finally {
+      super.stop();
+    }
+  }
+
+  @Override
+  public Status process() throws EventDeliveryException {
+    int batchSize = getMaxBatchSize();
+    long batchEndTime = System.currentTimeMillis() + getMaxBatchDurationMillis();
+    Channel myChannel = getChannel();
+    Transaction txn = myChannel.getTransaction();
+    txn.begin();
+    boolean isMorphlineTransactionCommitted = true;
+    try {
+      int numEventsTaken = 0;
+      handler.beginTransaction();
+      isMorphlineTransactionCommitted = false;
+
+      // repeatedly take and process events from the Flume queue
+      for (int i = 0; i < batchSize; i++) {
+        Event event = myChannel.take();
+        if (event == null) {
+          break;
+        }
+        sinkCounter.incrementEventDrainAttemptCount();
+        numEventsTaken++;
+        if (LOGGER.isTraceEnabled() && LogPrivacyUtil.allowLogRawData()) {
+          LOGGER.trace("Flume event arrived {}", event);
+        }
+
+        // hand the event to the morphline handler for processing
+        handler.process(event);
+        if (System.currentTimeMillis() >= batchEndTime) {
+          break;
+        }
+      }
+
+      // update metrics: each batch counts as exactly one of
+      // empty, underflow, or complete
+      if (numEventsTaken == 0) {
+        sinkCounter.incrementBatchEmptyCount();
+      } else if (numEventsTaken < batchSize) {
+        sinkCounter.incrementBatchUnderflowCount();
+      } else {
+        sinkCounter.incrementBatchCompleteCount();
+      }
+      handler.commitTransaction();
+      isMorphlineTransactionCommitted = true;
+      txn.commit();
+      sinkCounter.addToEventDrainSuccessCount(numEventsTaken);
+      return numEventsTaken == 0 ? Status.BACKOFF : Status.READY;
+    } catch (Throwable t) {
+      // Oops - need to roll back and back off
+      LOGGER.error("Morphline Sink " + getName() + ": Unable to process event from channel " +
+          myChannel.getName() + ". Exception follows.", t);
+      try {
+        if (!isMorphlineTransactionCommitted) {
+          handler.rollbackTransaction();
+        }
+      } catch (Throwable t2) {
+        LOGGER.error("Morphline Sink " + getName() +
+            ": Unable to rollback morphline transaction. Exception follows.", t2);
+      } finally {
+        try {
+          txn.rollback();
+        } catch (Throwable t4) {
+          LOGGER.error("Morphline Sink " + getName() + ": Unable to rollback Flume transaction. " +
+              "Exception follows.", t4);
+        }
+      }
+
+      if (t instanceof Error) {
+        throw (Error) t; // rethrow original exception
+      } else if (t instanceof ChannelException) {
+        return Status.BACKOFF;
+      } else {
+        throw new EventDeliveryException("Failed to send events", t); // rethrow and backoff
+      }
+    } finally {
+      txn.close();
+    }
+  }
+  
+  @Override
+  public String toString() {
+    int i = getClass().getName().lastIndexOf('.') + 1;
+    String shortClassName = getClass().getName().substring(i);
+    return getName() + " (" + shortClassName + ")";
+  }
+
+}
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/MorphlineSolrSink.java b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/MorphlineSolrSink.java
new file mode 100644
index 0000000..e403b10
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/MorphlineSolrSink.java
@@ -0,0 +1,49 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flume.sink.solr.morphline;
+
+import org.apache.flume.Context;
+
+import org.kitesdk.morphline.api.Command;
+import org.kitesdk.morphline.base.FaultTolerance;
+
+
+/**
+ * Flume sink that extracts search documents from Flume events, processes them using a morphline
+ * {@link Command} chain, and loads them into Apache Solr.
+ */
+public class MorphlineSolrSink extends MorphlineSink {
+
+  public MorphlineSolrSink() {
+    super();
+  }
+  
+  /** For testing only */
+  protected MorphlineSolrSink(MorphlineHandler handler) {
+    super(handler);
+  }
+
+  @Override
+  public void configure(Context context) {
+    if (context.getString(FaultTolerance.RECOVERABLE_EXCEPTION_CLASSES) == null) {
+      context.put(FaultTolerance.RECOVERABLE_EXCEPTION_CLASSES, 
+          "org.apache.solr.client.solrj.SolrServerException");      
+    }
+    super.configure(context);
+  }
+
+}
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/UUIDInterceptor.java b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/UUIDInterceptor.java
new file mode 100644
index 0000000..22d5347
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/UUIDInterceptor.java
@@ -0,0 +1,115 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flume.sink.solr.morphline;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Map;
+import java.util.UUID;
+
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.flume.interceptor.Interceptor;
+
+/**
+ * Flume Interceptor that sets a universally unique identifier on all events
+ * that are intercepted. By default this event header is named "id".
+ */
+public class UUIDInterceptor implements Interceptor {
+
+  private String headerName;
+  private boolean preserveExisting;
+  private String prefix;
+
+  public static final String HEADER_NAME = "headerName";
+  public static final String PRESERVE_EXISTING_NAME = "preserveExisting";
+  public static final String PREFIX_NAME = "prefix";
+
+  protected UUIDInterceptor(Context context) {
+    headerName = context.getString(HEADER_NAME, "id");
+    preserveExisting = context.getBoolean(PRESERVE_EXISTING_NAME, true);
+    prefix = context.getString(PREFIX_NAME, "");
+  }
+
+  @Override
+  public void initialize() {
+  }
+
+  protected String getPrefix() {
+    return prefix;
+  }
+
+  protected String generateUUID() {
+    return getPrefix() + UUID.randomUUID().toString();
+  }
+
+  protected boolean isMatch(Event event) {
+    return true;
+  }
+
+  @Override
+  public Event intercept(Event event) {
+    Map<String, String> headers = event.getHeaders();
+    if (preserveExisting && headers.containsKey(headerName)) {
+      // we must preserve the existing id
+    } else if (isMatch(event)) {
+      headers.put(headerName, generateUUID());
+    }
+    return event;
+  }
+
+  @Override
+  public List<Event> intercept(List<Event> events) {
+    List<Event> results = new ArrayList<Event>(events.size());
+    for (Event event : events) {
+      event = intercept(event);
+      if (event != null) {
+        results.add(event);
+      }
+    }
+    return results;
+  }
+
+  @Override
+  public void close() {
+  }
+
+  
+  ///////////////////////////////////////////////////////////////////////////////
+  // Nested classes:
+  ///////////////////////////////////////////////////////////////////////////////
+  /** Builder implementations MUST have a public no-arg constructor */
+  public static class Builder implements Interceptor.Builder {
+
+    private Context context;
+
+    public Builder() {
+    }
+
+    @Override
+    public UUIDInterceptor build() {
+      return new UUIDInterceptor(context);
+    }
+
+    @Override
+    public void configure(Context context) {
+      this.context = context;
+    }
+
+  }
+
+}
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/java/org/apache/flume/sink/solr/morphline/EmbeddedSource.java b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/java/org/apache/flume/sink/solr/morphline/EmbeddedSource.java
new file mode 100644
index 0000000..b30fece
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/java/org/apache/flume/sink/solr/morphline/EmbeddedSource.java
@@ -0,0 +1,45 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flume.sink.solr.morphline;
+
+import java.util.List;
+
+import org.apache.flume.Event;
+import org.apache.flume.EventDeliveryException;
+import org.apache.flume.EventDrivenSource;
+import org.apache.flume.Sink;
+import org.apache.flume.source.AbstractSource;
+
+class EmbeddedSource extends AbstractSource implements EventDrivenSource {
+
+  private Sink sink;
+
+  public EmbeddedSource(Sink sink) {
+    this.sink = sink;
+  }
+
+  public void load(Event event) throws EventDeliveryException {
+    getChannelProcessor().processEvent(event);
+    sink.process();
+  }
+
+  public void load(List<Event> events) throws EventDeliveryException {
+    getChannelProcessor().processEventBatch(events);
+    sink.process();
+  }
+
+}
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/java/org/apache/flume/sink/solr/morphline/FlumeHttpServletRequestWrapper.java b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/java/org/apache/flume/sink/solr/morphline/FlumeHttpServletRequestWrapper.java
new file mode 100644
index 0000000..9711a3a
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/java/org/apache/flume/sink/solr/morphline/FlumeHttpServletRequestWrapper.java
@@ -0,0 +1,321 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flume.sink.solr.morphline;
+
+import java.io.BufferedReader;
+import java.io.ByteArrayInputStream;
+import java.io.IOException;
+import java.io.InputStream;
+import java.io.UnsupportedEncodingException;
+import java.security.Principal;
+import java.util.Collections;
+import java.util.Enumeration;
+import java.util.Locale;
+import java.util.Map;
+
+import javax.servlet.RequestDispatcher;
+import javax.servlet.ServletInputStream;
+import javax.servlet.http.Cookie;
+import javax.servlet.http.HttpServletRequest;
+import javax.servlet.http.HttpSession;
+
+class FlumeHttpServletRequestWrapper implements HttpServletRequest {
+
+  private ServletInputStream stream;
+  private String charset;
+  
+  public FlumeHttpServletRequestWrapper(final byte[] data) {
+    stream = new ServletInputStream() {
+      private final InputStream in = new ByteArrayInputStream(data);      
+      @Override
+      public int read() throws IOException {
+        return in.read();
+      }
+    };
+  }
+
+  @Override
+  public String getAuthType() {
+    throw new UnsupportedOperationException("Not supported yet.");
+  }
+
+  @Override
+  public Cookie[] getCookies() {
+    throw new UnsupportedOperationException("Not supported yet.");
+  }
+
+  @Override
+  public long getDateHeader(String name) {
+    throw new UnsupportedOperationException("Not supported yet.");
+  }
+
+  @Override
+  public String getHeader(String name) {
+    throw new UnsupportedOperationException("Not supported yet.");
+  }
+
+  @Override
+  public Enumeration getHeaders(String name) {
+    throw new UnsupportedOperationException("Not supported yet.");
+  }
+
+  @Override
+  public Enumeration getHeaderNames() {
+    return Collections.enumeration(Collections.EMPTY_LIST);
+  }
+
+  @Override
+  public int getIntHeader(String name) {
+    throw new UnsupportedOperationException("Not supported yet.");
+  }
+
+  @Override
+  public String getMethod() {
+    throw new UnsupportedOperationException("Not supported yet.");
+  }
+
+  @Override
+  public String getPathInfo() {
+    throw new UnsupportedOperationException("Not supported yet.");
+  }
+
+  @Override
+  public String getPathTranslated() {
+    throw new UnsupportedOperationException("Not supported yet.");
+  }
+
+  @Override
+  public String getContextPath() {
+    throw new UnsupportedOperationException("Not supported yet.");
+  }
+
+  @Override
+  public String getQueryString() {
+    throw new UnsupportedOperationException("Not supported yet.");
+  }
+
+  @Override
+  public String getRemoteUser() {
+    throw new UnsupportedOperationException("Not supported yet.");
+  }
+
+  @Override
+  public boolean isUserInRole(String role) {
+    throw new UnsupportedOperationException("Not supported yet.");
+  }
+
+  @Override
+  public Principal getUserPrincipal() {
+    throw new UnsupportedOperationException("Not supported yet.");
+  }
+
+  @Override
+  public String getRequestedSessionId() {
+    throw new UnsupportedOperationException("Not supported yet.");
+  }
+
+  @Override
+  public String getRequestURI() {
+    throw new UnsupportedOperationException("Not supported yet.");
+  }
+
+  @Override
+  public StringBuffer getRequestURL() {
+    throw new UnsupportedOperationException("Not supported yet.");
+  }
+
+  @Override
+  public String getServletPath() {
+    throw new UnsupportedOperationException("Not supported yet.");
+  }
+
+  @Override
+  public HttpSession getSession(boolean create) {
+    throw new UnsupportedOperationException("Not supported yet.");
+  }
+
+  @Override
+  public HttpSession getSession() {
+    throw new UnsupportedOperationException("Not supported yet.");
+  }
+
+  @Override
+  public boolean isRequestedSessionIdValid() {
+    throw new UnsupportedOperationException("Not supported yet.");
+  }
+
+  @Override
+  public boolean isRequestedSessionIdFromCookie() {
+    throw new UnsupportedOperationException("Not supported yet.");
+  }
+
+  @Override
+  public boolean isRequestedSessionIdFromURL() {
+    throw new UnsupportedOperationException("Not supported yet.");
+  }
+
+  @Override
+  public boolean isRequestedSessionIdFromUrl() {
+    throw new UnsupportedOperationException("Not supported yet.");
+  }
+
+  @Override
+  public Object getAttribute(String name) {
+    throw new UnsupportedOperationException("Not supported yet.");
+  }
+
+  @Override
+  public Enumeration getAttributeNames() {
+    throw new UnsupportedOperationException("Not supported yet.");
+  }
+
+  @Override
+  public String getCharacterEncoding() {
+    return charset;
+  }
+
+  @Override
+  public void setCharacterEncoding(String env) throws UnsupportedEncodingException {
+    this.charset = env;
+  }
+
+  @Override
+  public int getContentLength() {
+    throw new UnsupportedOperationException("Not supported yet.");
+  }
+
+  @Override
+  public String getContentType() {
+    return null;
+  }
+
+  @Override
+  public ServletInputStream getInputStream() throws IOException {
+    return stream;
+  }
+
+  @Override
+  public String getParameter(String name) {
+    throw new UnsupportedOperationException("Not supported yet.");
+  }
+
+  @Override
+  public Enumeration getParameterNames() {
+    return Collections.enumeration(Collections.EMPTY_LIST);
+  }
+
+  @Override
+  public String[] getParameterValues(String name) {
+    throw new UnsupportedOperationException("Not supported yet.");
+  }
+
+  @Override
+  public Map getParameterMap() {
+    throw new UnsupportedOperationException("Not supported yet.");
+  }
+
+  @Override
+  public String getProtocol() {
+    throw new UnsupportedOperationException("Not supported yet.");
+  }
+
+  @Override
+  public String getScheme() {
+    throw new UnsupportedOperationException("Not supported yet.");
+  }
+
+  @Override
+  public String getServerName() {
+    throw new UnsupportedOperationException("Not supported yet.");
+  }
+
+  @Override
+  public int getServerPort() {
+    throw new UnsupportedOperationException("Not supported yet.");
+  }
+
+  @Override
+  public BufferedReader getReader() throws IOException {
+    throw new UnsupportedOperationException("Not supported yet.");
+  }
+
+  @Override
+  public String getRemoteAddr() {
+    throw new UnsupportedOperationException("Not supported yet.");
+  }
+
+  @Override
+  public String getRemoteHost() {
+    throw new UnsupportedOperationException("Not supported yet.");
+  }
+
+  @Override
+  public void setAttribute(String name, Object o) {
+    throw new UnsupportedOperationException("Not supported yet.");
+  }
+
+  @Override
+  public void removeAttribute(String name) {
+    throw new UnsupportedOperationException("Not supported yet.");
+  }
+
+  @Override
+  public Locale getLocale() {
+    throw new UnsupportedOperationException("Not supported yet.");
+  }
+
+  @Override
+  public Enumeration getLocales() {
+    throw new UnsupportedOperationException("Not supported yet.");
+  }
+
+  @Override
+  public boolean isSecure() {
+    throw new UnsupportedOperationException("Not supported yet.");
+  }
+
+  @Override
+  public RequestDispatcher getRequestDispatcher(String path) {
+    throw new UnsupportedOperationException("Not supported yet.");
+  }
+
+  @Override
+  public String getRealPath(String path) {
+    throw new UnsupportedOperationException("Not supported yet.");
+  }
+
+  @Override
+  public int getRemotePort() {
+    throw new UnsupportedOperationException("Not supported yet.");
+  }
+
+  @Override
+  public String getLocalName() {
+    throw new UnsupportedOperationException("Not supported yet.");
+  }
+
+  @Override
+  public String getLocalAddr() {
+    throw new UnsupportedOperationException("Not supported yet.");
+  }
+
+  @Override
+  public int getLocalPort() {
+    throw new UnsupportedOperationException("Not supported yet.");
+  }
+}
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/java/org/apache/flume/sink/solr/morphline/ResettableTestStringInputStream.java b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/java/org/apache/flume/sink/solr/morphline/ResettableTestStringInputStream.java
new file mode 100644
index 0000000..e6ee9b9
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/java/org/apache/flume/sink/solr/morphline/ResettableTestStringInputStream.java
@@ -0,0 +1,89 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flume.sink.solr.morphline;
+
+import java.io.IOException;
+
+import org.apache.flume.serialization.ResettableInputStream;
+
+class ResettableTestStringInputStream extends ResettableInputStream {
+
+  private String str;
+  private int markPos = 0;
+  private int curPos = 0;
+
+  /**
+   * Warning: This test class does not handle character/byte conversion at all!
+   * @param str String to use for testing
+   */
+  public ResettableTestStringInputStream(String str) {
+    this.str = str;
+  }
+
+  @Override
+  public int readChar() throws IOException {
+    throw new UnsupportedOperationException("This test class doesn't return " +
+        "strings!");
+  }
+
+  @Override
+  public void mark() throws IOException {
+    markPos = curPos;
+  }
+
+  @Override
+  public void reset() throws IOException {
+    curPos = markPos;
+  }
+
+  @Override
+  public void seek(long position) throws IOException {
+    throw new UnsupportedOperationException("Unimplemented in test class");
+  }
+
+  @Override
+  public long tell() throws IOException {
+    throw new UnsupportedOperationException("Unimplemented in test class");
+  }
+
+  @Override
+  public int read() throws IOException {
+    if (curPos >= str.length()) {
+      return -1;
+    }
+    return str.charAt(curPos++);
+  }
+
+  @Override
+  public int read(byte[] b, int off, int len) throws IOException {
+    if (curPos >= str.length()) {
+      return -1;
+    }
+    int n = 0;
+    while (len > 0 && curPos < str.length()) {
+      b[off++] = (byte) str.charAt(curPos++);
+      n++;
+      len--;
+    }
+    return n;
+  }
+
+  @Override
+  public void close() throws IOException {
+    // no-op
+  }
+}
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/java/org/apache/flume/sink/solr/morphline/TestBlobDeserializer.java b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/java/org/apache/flume/sink/solr/morphline/TestBlobDeserializer.java
new file mode 100644
index 0000000..be377ba
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/java/org/apache/flume/sink/solr/morphline/TestBlobDeserializer.java
@@ -0,0 +1,122 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flume.sink.solr.morphline;
+
+import com.google.common.base.Charsets;
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.flume.serialization.EventDeserializer;
+import org.apache.flume.serialization.EventDeserializerFactory;
+import org.apache.flume.serialization.ResettableInputStream;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;
+
+import java.io.IOException;
+import java.util.List;
+
+public class TestBlobDeserializer extends Assert {
+
+  private String mini;
+
+  @Before
+  public void setup() {
+    StringBuilder sb = new StringBuilder();
+    sb.append("line 1\n");
+    sb.append("line 2\n");
+    mini = sb.toString();
+  }
+
+  @Test
+  public void testSimple() throws IOException {
+    ResettableInputStream in = new ResettableTestStringInputStream(mini);
+    EventDeserializer des = new BlobDeserializer(new Context(), in);
+    validateMiniParse(des);
+  }
+
+  @Test
+  public void testSimpleViaBuilder() throws IOException {
+    ResettableInputStream in = new ResettableTestStringInputStream(mini);
+    EventDeserializer.Builder builder = new BlobDeserializer.Builder();
+    EventDeserializer des = builder.build(new Context(), in);
+    validateMiniParse(des);
+  }
+
+  @Test
+  public void testSimpleViaFactory() throws IOException {
+    ResettableInputStream in = new ResettableTestStringInputStream(mini);
+    EventDeserializer des;
+    des = EventDeserializerFactory.getInstance(BlobDeserializer.Builder.class.getName(),
+                                               new Context(), in);
+    validateMiniParse(des);
+  }
+
+  @Test
+  public void testBatch() throws IOException {
+    ResettableInputStream in = new ResettableTestStringInputStream(mini);
+    EventDeserializer des = new BlobDeserializer(new Context(), in);
+    List<Event> events;
+
+    events = des.readEvents(10); // try to read more than we should have
+    assertEquals(1, events.size());
+    assertEventBodyEquals(mini, events.get(0));
+
+    des.mark();
+    des.close();
+  }
+
+  // a blob longer than the configured maximum blob length is split at that boundary
+  @Test
+  public void testMaxLineLength() throws IOException {
+    String longLine = "abcdefghijklmnopqrstuvwxyz\n";
+    Context ctx = new Context();
+    ctx.put(BlobDeserializer.MAX_BLOB_LENGTH_KEY, "10");
+
+    ResettableInputStream in = new ResettableTestStringInputStream(longLine);
+    EventDeserializer des = new BlobDeserializer(ctx, in);
+
+    assertEventBodyEquals("abcdefghij", des.readEvent());
+    assertEventBodyEquals("klmnopqrst", des.readEvent());
+    assertEventBodyEquals("uvwxyz\n", des.readEvent());
+    assertNull(des.readEvent());
+  }
+
+  private void assertEventBodyEquals(String expected, Event event) {
+    String bodyStr = new String(event.getBody(), Charsets.UTF_8);
+    assertEquals(expected, bodyStr);
+  }
+
+  private void validateMiniParse(EventDeserializer des) throws IOException {
+    Event evt;
+
+    des.mark();
+    evt = des.readEvent();
+    assertEquals(mini, new String(evt.getBody(), Charsets.UTF_8));
+    des.reset(); // reset!
+
+    evt = des.readEvent();
+    assertEquals("data should be repeated, " +
+        "because we reset() the stream", mini, new String(evt.getBody(), Charsets.UTF_8));
+
+    evt = des.readEvent();
+    assertNull("Event should be null because there is no data " +
+        "left to read", evt);
+
+    des.mark();
+    des.close();
+  }
+}
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/java/org/apache/flume/sink/solr/morphline/TestBlobHandler.java b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/java/org/apache/flume/sink/solr/morphline/TestBlobHandler.java
new file mode 100644
index 0000000..3e7de99
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/java/org/apache/flume/sink/solr/morphline/TestBlobHandler.java
@@ -0,0 +1,60 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flume.sink.solr.morphline;
+
+import java.util.List;
+
+import javax.servlet.http.HttpServletRequest;
+
+import org.apache.flume.Event;
+import org.apache.flume.source.http.HTTPSourceHandler;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;
+
+public class TestBlobHandler extends Assert {
+
+  private HTTPSourceHandler handler;
+
+  @Before
+  public void setUp() {
+    handler = new BlobHandler();
+  }
+
+  @Test
+  public void testSingleEvent() throws Exception {
+    byte[] body = "foo".getBytes("UTF-8");
+    HttpServletRequest req = new FlumeHttpServletRequestWrapper(body);
+    List<Event> deserialized = handler.getEvents(req);
+    assertEquals(1, deserialized.size());
+    Event e = deserialized.get(0);
+    assertEquals(0, e.getHeaders().size());
+    assertEquals("foo", new String(e.getBody(), "UTF-8"));
+  }
+
+  @Test
+  public void testEmptyEvent() throws Exception {
+    byte[] body = "".getBytes("UTF-8");
+    HttpServletRequest req = new FlumeHttpServletRequestWrapper(body);
+    List<Event> deserialized = handler.getEvents(req);
+    assertEquals(1, deserialized.size());
+    Event e = deserialized.get(0);
+    assertEquals(0, e.getHeaders().size());
+    assertEquals("", new String(e.getBody(), "UTF-8"));
+  }
+
+}
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/java/org/apache/flume/sink/solr/morphline/TestEnvironment.java b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/java/org/apache/flume/sink/solr/morphline/TestEnvironment.java
new file mode 100644
index 0000000..933a6b1
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/java/org/apache/flume/sink/solr/morphline/TestEnvironment.java
@@ -0,0 +1,33 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flume.sink.solr.morphline;
+
+import java.net.UnknownHostException;
+
+import org.junit.Test;
+
+import org.kitesdk.morphline.solr.EnvironmentTest;
+
+/** Print and verify some info about the environment in which the unit tests are running */
+public class TestEnvironment extends EnvironmentTest {
+
+  @Test
+  public void testEnvironment() throws UnknownHostException {
+    super.testEnvironment();
+  }
+  
+}
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/java/org/apache/flume/sink/solr/morphline/TestMorphlineInterceptor.java b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/java/org/apache/flume/sink/solr/morphline/TestMorphlineInterceptor.java
new file mode 100644
index 0000000..8d62d38
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/java/org/apache/flume/sink/solr/morphline/TestMorphlineInterceptor.java
@@ -0,0 +1,167 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flume.sink.solr.morphline;
+
+import com.google.common.base.Charsets;
+import com.google.common.collect.ImmutableMap;
+import com.google.common.io.Files;
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.flume.event.EventBuilder;
+import org.junit.Assert;
+import org.junit.Test;
+import org.kitesdk.morphline.base.Fields;
+
+import java.io.File;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+public class TestMorphlineInterceptor extends Assert {
+
+  private static final String RESOURCES_DIR = "target/test-classes";
+
+  @Test
+  public void testNoOperation() throws Exception {
+    Context context = new Context();
+    context.put(MorphlineHandlerImpl.MORPHLINE_FILE_PARAM,
+                RESOURCES_DIR + "/test-morphlines/noOperation.conf");
+    Event input = EventBuilder.withBody("foo", Charsets.UTF_8);
+    input.getHeaders().put("name", "nadja");
+    MorphlineInterceptor interceptor = build(context);
+    Event actual = interceptor.intercept(input);
+    interceptor.close();
+    Event expected = EventBuilder.withBody("foo".getBytes(Charsets.UTF_8),
+                                           ImmutableMap.of("name", "nadja"));
+    assertEqualsEvent(expected, actual);
+    
+    List<Event> actualList = build(context).intercept(Collections.singletonList(input));
+    List<Event> expectedList = Collections.singletonList(expected);
+    assertEqualsEventList(expectedList, actualList);
+  }
+
+  @Test
+  public void testReadClob() throws Exception {
+    Context context = new Context();
+    context.put(MorphlineHandlerImpl.MORPHLINE_FILE_PARAM,
+                RESOURCES_DIR + "/test-morphlines/readClob.conf");
+    Event input = EventBuilder.withBody("foo", Charsets.UTF_8);
+    input.getHeaders().put("name", "nadja");
+    Event actual = build(context).intercept(input);
+    Event expected = EventBuilder.withBody(null,
+                                           ImmutableMap.of("name", "nadja", Fields.MESSAGE, "foo"));
+    assertEqualsEvent(expected, actual);
+
+    List<Event> actualList = build(context).intercept(Collections.singletonList(input));
+    List<Event> expectedList = Collections.singletonList(expected);
+    assertEqualsEventList(expectedList, actualList);
+  }
+
+  @Test
+  public void testGrokIfNotMatchDropEventRetain() throws Exception {
+    Context context = new Context();
+    context.put(MorphlineHandlerImpl.MORPHLINE_FILE_PARAM,
+                RESOURCES_DIR + "/test-morphlines/grokIfNotMatchDropRecord.conf");
+
+    String msg = "<164>Feb  4 10:46:14 syslog sshd[607]: Server listening on 0.0.0.0 port 22.";
+    Event input = EventBuilder.withBody(null, ImmutableMap.of(Fields.MESSAGE, msg));
+    Event actual = build(context).intercept(input);
+
+    Map<String, String> expected = new HashMap<String, String>();
+    expected.put(Fields.MESSAGE, msg);
+    expected.put("syslog_pri", "164");
+    expected.put("syslog_timestamp", "Feb  4 10:46:14");
+    expected.put("syslog_hostname", "syslog");
+    expected.put("syslog_program", "sshd");
+    expected.put("syslog_pid", "607");
+    expected.put("syslog_message", "Server listening on 0.0.0.0 port 22.");
+    Event expectedEvent = EventBuilder.withBody(null, expected);
+    assertEqualsEvent(expectedEvent, actual);
+  }
+
+  @Test
+  /* The leading XXXXX does not match the regex, so we expect the event to be dropped. */
+  public void testGrokIfNotMatchDropEventDrop() throws Exception {
+    Context context = new Context();
+    context.put(MorphlineHandlerImpl.MORPHLINE_FILE_PARAM,
+                RESOURCES_DIR + "/test-morphlines/grokIfNotMatchDropRecord.conf");
+    String msg = "<XXXXXXXXXXXXX164>Feb  4 10:46:14 syslog sshd[607]: Server listening on 0.0.0.0" +
+                     " port 22.";
+    Event input = EventBuilder.withBody(null, ImmutableMap.of(Fields.MESSAGE, msg));
+    Event actual = build(context).intercept(input);
+    assertNull(actual);
+  }
+
+  @Test
+  /** morphline says route to southpole if it's an avro file, otherwise route to northpole */
+  public void testIfDetectMimeTypeRouteToSouthPole() throws Exception {
+    Context context = new Context();
+    context.put(MorphlineHandlerImpl.MORPHLINE_FILE_PARAM,
+                RESOURCES_DIR + "/test-morphlines/ifDetectMimeType.conf");
+    context.put(MorphlineHandlerImpl.MORPHLINE_VARIABLE_PARAM + ".MY.MIME_TYPE", "avro/binary");
+
+    Event input = EventBuilder.withBody(Files.toByteArray(
+        new File(RESOURCES_DIR + "/test-documents/sample-statuses-20120906-141433.avro")));
+    Event actual = build(context).intercept(input);
+
+    Map<String, String> expected = new HashMap<String, String>();
+    expected.put(Fields.ATTACHMENT_MIME_TYPE, "avro/binary");
+    expected.put("flume.selector.header", "goToSouthPole");
+    Event expectedEvent = EventBuilder.withBody(input.getBody(), expected);
+    assertEqualsEvent(expectedEvent, actual);
+  }
+
+  @Test
+  /** morphline says route to southpole if it's an avro file, otherwise route to northpole */
+  public void testIfDetectMimeTypeRouteToNorthPole() throws Exception {
+    Context context = new Context();
+    context.put(MorphlineHandlerImpl.MORPHLINE_FILE_PARAM,
+                RESOURCES_DIR + "/test-morphlines/ifDetectMimeType.conf");
+    context.put(MorphlineHandlerImpl.MORPHLINE_VARIABLE_PARAM + ".MY.MIME_TYPE", "avro/binary");
+
+    Event input = EventBuilder.withBody(
+        Files.toByteArray(new File(RESOURCES_DIR + "/test-documents/testPDF.pdf")));
+    Event actual = build(context).intercept(input);
+
+    Map<String, String> expected = new HashMap<String, String>();
+    expected.put(Fields.ATTACHMENT_MIME_TYPE, "application/pdf");
+    expected.put("flume.selector.header", "goToNorthPole");
+    Event expectedEvent = EventBuilder.withBody(input.getBody(), expected);
+    assertEqualsEvent(expectedEvent, actual);
+  }
+
+  private MorphlineInterceptor build(Context context) {
+    MorphlineInterceptor.Builder builder = new MorphlineInterceptor.Builder();
+    builder.configure(context);
+    return builder.build();
+  }
+
+  // SimpleEvent does not implement equals(), so compare headers and body explicitly.
+  private void assertEqualsEvent(Event x, Event y) {
+    assertEquals(x.getHeaders(), y.getHeaders());
+    assertArrayEquals(x.getBody(), y.getBody());
+  }
+
+  private void assertEqualsEventList(List<Event> x, List<Event> y) {
+    assertEquals(x.size(), y.size());
+    for (int i = 0; i < x.size(); i++) {
+      assertEqualsEvent(x.get(i), y.get(i));      
+    }
+  }
+
+}
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/java/org/apache/flume/sink/solr/morphline/TestMorphlineSolrSink.java b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/java/org/apache/flume/sink/solr/morphline/TestMorphlineSolrSink.java
new file mode 100644
index 0000000..1bfae95
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/java/org/apache/flume/sink/solr/morphline/TestMorphlineSolrSink.java
@@ -0,0 +1,431 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flume.sink.solr.morphline;
+
+import java.io.File;
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+import java.util.UUID;
+import java.util.concurrent.atomic.AtomicInteger;
+
+import org.apache.flume.Channel;
+import org.apache.flume.ChannelSelector;
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.flume.EventDeliveryException;
+import org.apache.flume.channel.ChannelProcessor;
+import org.apache.flume.channel.MemoryChannel;
+import org.apache.flume.channel.ReplicatingChannelSelector;
+import org.apache.flume.conf.Configurables;
+import org.apache.flume.event.EventBuilder;
+import org.apache.solr.SolrTestCaseJ4;
+import org.apache.solr.client.solrj.SolrQuery;
+import org.apache.solr.client.solrj.SolrServer;
+import org.apache.solr.client.solrj.SolrServerException;
+import org.apache.solr.client.solrj.response.QueryResponse;
+import org.apache.solr.common.SolrDocument;
+import org.junit.After;
+import org.junit.Before;
+import org.junit.BeforeClass;
+import org.junit.Test;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.kitesdk.morphline.api.MorphlineContext;
+import org.kitesdk.morphline.api.Record;
+import org.kitesdk.morphline.base.FaultTolerance;
+import org.kitesdk.morphline.base.Fields;
+import org.kitesdk.morphline.solr.DocumentLoader;
+import org.kitesdk.morphline.solr.SolrLocator;
+import org.kitesdk.morphline.solr.SolrMorphlineContext;
+import org.kitesdk.morphline.solr.SolrServerDocumentLoader;
+import org.kitesdk.morphline.solr.TestEmbeddedSolrServer;
+import com.codahale.metrics.MetricRegistry;
+import com.google.common.base.Charsets;
+import com.google.common.collect.ImmutableListMultimap;
+import com.google.common.collect.ListMultimap;
+import com.google.common.io.Files;
+
+public class TestMorphlineSolrSink extends SolrTestCaseJ4 {
+
+  private EmbeddedSource source;
+  private SolrServer solrServer;
+  private MorphlineSink sink;
+  private Map<String,Integer> expectedRecords;
+
+  private File tmpFile;
+  private static final boolean TEST_WITH_EMBEDDED_SOLR_SERVER = true;
+  private static final String EXTERNAL_SOLR_SERVER_URL = System.getProperty("externalSolrServer");
+//private static final String EXTERNAL_SOLR_SERVER_URL = "http://127.0.0.1:8983/solr";
+  private static final String RESOURCES_DIR = "target/test-classes";
+//private static final String RESOURCES_DIR = "src/test/resources";
+  private static final AtomicInteger SEQ_NUM = new AtomicInteger();
+  private static final AtomicInteger SEQ_NUM2 = new AtomicInteger();
+  private static final Logger LOGGER = LoggerFactory.getLogger(TestMorphlineSolrSink.class);
+
+  @BeforeClass
+  public static void beforeClass() throws Exception {
+    initCore(
+        RESOURCES_DIR + "/solr/collection1/conf/solrconfig.xml", 
+        RESOURCES_DIR + "/solr/collection1/conf/schema.xml",
+        RESOURCES_DIR + "/solr");
+  }
+
+  @Before
+  @Override
+  public void setUp() throws Exception {
+    super.setUp();
+    String path = RESOURCES_DIR + "/test-documents";
+    expectedRecords = new HashMap<String, Integer>();
+    expectedRecords.put(path + "/sample-statuses-20120906-141433.avro", 2);
+    expectedRecords.put(path + "/sample-statuses-20120906-141433", 2);
+    expectedRecords.put(path + "/sample-statuses-20120906-141433.gz", 2);
+    expectedRecords.put(path + "/sample-statuses-20120906-141433.bz2", 2);
+    expectedRecords.put(path + "/cars.csv", 5);
+    expectedRecords.put(path + "/cars.csv.gz", 5);
+    expectedRecords.put(path + "/cars.tar.gz", 4);
+    expectedRecords.put(path + "/cars.tsv", 5);
+    expectedRecords.put(path + "/cars.ssv", 5);
+
+    final Map<String, String> context = new HashMap<String, String>();
+    
+    if (EXTERNAL_SOLR_SERVER_URL != null) {
+      throw new UnsupportedOperationException();
+      //solrServer = new ConcurrentUpdateSolrServer(EXTERNAL_SOLR_SERVER_URL, 2, 2);
+      //solrServer = new SafeConcurrentUpdateSolrServer(EXTERNAL_SOLR_SERVER_URL, 2, 2);
+      //solrServer = new HttpSolrServer(EXTERNAL_SOLR_SERVER_URL);
+    } else {
+      if (TEST_WITH_EMBEDDED_SOLR_SERVER) {
+        solrServer = new TestEmbeddedSolrServer(h.getCoreContainer(), "");
+      } else {
+        throw new RuntimeException("Not yet implemented");
+        //solrServer = new TestSolrServer(getSolrServer());
+      }
+    }
+
+    Map<String, String> channelContext = new HashMap<String, String>();
+    channelContext.put("capacity", "1000000");
+    channelContext.put("keep-alive", "0"); // for faster tests
+    Channel channel = new MemoryChannel();
+    channel.setName(channel.getClass().getName() + SEQ_NUM.getAndIncrement());
+    Configurables.configure(channel, new Context(channelContext));
+ 
+    class MySolrSink extends MorphlineSolrSink {
+      public MySolrSink(MorphlineHandlerImpl indexer) {
+        super(indexer);
+      }
+    }
+    
+    int batchSize = SEQ_NUM2.incrementAndGet() % 2 == 0 ? 100 : 1;
+    DocumentLoader testServer = new SolrServerDocumentLoader(solrServer, batchSize);
+    MorphlineContext solrMorphlineContext = new SolrMorphlineContext.Builder()
+        .setDocumentLoader(testServer)
+        .setExceptionHandler(new FaultTolerance(false, false, SolrServerException.class.getName()))
+        .setMetricRegistry(new MetricRegistry()).build();
+    
+    MorphlineHandlerImpl impl = new MorphlineHandlerImpl();
+    impl.setMorphlineContext(solrMorphlineContext);
+    
+    class MySolrLocator extends SolrLocator { // trick to access protected ctor
+      public MySolrLocator(MorphlineContext indexer) {
+        super(indexer);
+      }
+    }
+
+    SolrLocator locator = new MySolrLocator(solrMorphlineContext);
+    locator.setSolrHomeDir(testSolrHome + "/collection1");
+    String str1 = "SOLR_LOCATOR : " + locator.toString();
+    //File solrLocatorFile = new File("target/test-classes/test-morphlines/solrLocator.conf");
+    //String str1 = Files.toString(solrLocatorFile, Charsets.UTF_8);
+    File morphlineFile = new File("target/test-classes/test-morphlines/solrCellDocumentTypes.conf");
+    String str2 = Files.toString(morphlineFile, Charsets.UTF_8);
+    tmpFile = File.createTempFile("morphline", ".conf");
+    tmpFile.deleteOnExit();
+    Files.write(str1 + "\n" + str2, tmpFile, Charsets.UTF_8);    
+    context.put("morphlineFile", tmpFile.getPath());
+
+    impl.configure(new Context(context));
+    sink = new MySolrSink(impl);
+    sink.setName(sink.getClass().getName() + SEQ_NUM.getAndIncrement());
+    sink.configure(new Context(context));
+    sink.setChannel(channel);
+    sink.start();
+    
+    source = new EmbeddedSource(sink);    
+    ChannelSelector rcs = new ReplicatingChannelSelector();
+    rcs.setChannels(Collections.singletonList(channel));
+    ChannelProcessor chp = new ChannelProcessor(rcs);
+    Context chpContext = new Context();
+    chpContext.put("interceptors", "uuidinterceptor");
+    chpContext.put("interceptors.uuidinterceptor.type", UUIDInterceptor.Builder.class.getName());
+    chp.configure(chpContext);
+    source.setChannelProcessor(chp);
+    
+    deleteAllDocuments();
+  }
+  
+  private void deleteAllDocuments() throws SolrServerException, IOException {
+    SolrServer s = solrServer;
+    s.deleteByQuery("*:*"); // delete everything!
+    s.commit();
+  }
+
+  @After
+  @Override
+  public void tearDown() throws Exception {
+    try {
+      if (source != null) {
+        source.stop();
+        source = null;
+      }
+      if (sink != null) {
+        sink.stop();
+        sink = null;
+      }
+      if (tmpFile != null) {
+        tmpFile.delete();
+      }
+    } finally {
+      solrServer = null;
+      expectedRecords = null;
+      super.tearDown();
+    }
+  }
+
+  @Test
+  public void testDocumentTypes() throws Exception {
+    String path = RESOURCES_DIR + "/test-documents";
+    String[] files = new String[] {
+        path + "/testBMPfp.txt",
+        path + "/boilerplate.html",
+        path + "/NullHeader.docx",
+        path + "/testWORD_various.doc",          
+        path + "/testPDF.pdf",
+        path + "/testJPEG_EXIF.jpg",
+        path + "/testXML.xml",          
+//        path + "/cars.csv",
+//        path + "/cars.tsv",
+//        path + "/cars.ssv",
+//        path + "/cars.csv.gz",
+//        path + "/cars.tar.gz",
+        path + "/sample-statuses-20120906-141433.avro",
+        path + "/sample-statuses-20120906-141433",
+        path + "/sample-statuses-20120906-141433.gz",
+        path + "/sample-statuses-20120906-141433.bz2",
+    };
+    testDocumentTypesInternal(files);
+  }
+
+  @Test
+  public void testDocumentTypes2() throws Exception {
+    String path = RESOURCES_DIR + "/test-documents";
+    String[] files = new String[] {
+        path + "/testPPT_various.ppt",
+        path + "/testPPT_various.pptx",        
+        path + "/testEXCEL.xlsx",
+        path + "/testEXCEL.xls", 
+        path + "/testPages.pages", 
+        path + "/testNumbers.numbers", 
+        path + "/testKeynote.key",
+        
+        path + "/testRTFVarious.rtf", 
+        path + "/complex.mbox", 
+        path + "/test-outlook.msg", 
+        path + "/testEMLX.emlx",
+//        path + "/testRFC822",  
+        path + "/rsstest.rss", 
+//        path + "/testDITA.dita", 
+        
+        path + "/testMP3i18n.mp3", 
+        path + "/testAIFF.aif", 
+        path + "/testFLAC.flac", 
+//        path + "/testFLAC.oga", 
+//        path + "/testVORBIS.ogg",  
+        path + "/testMP4.m4a", 
+        path + "/testWAV.wav", 
+//        path + "/testWMA.wma", 
+        
+        path + "/testFLV.flv", 
+//        path + "/testWMV.wmv", 
+        
+        path + "/testBMP.bmp", 
+        path + "/testPNG.png", 
+        path + "/testPSD.psd",        
+        path + "/testSVG.svg",  
+        path + "/testTIFF.tif",     
+
+//        path + "/test-documents.7z", 
+//        path + "/test-documents.cpio",
+//        path + "/test-documents.tar", 
+//        path + "/test-documents.tbz2", 
+//        path + "/test-documents.tgz",
+//        path + "/test-documents.zip",
+//        path + "/test-zip-of-zip.zip",
+//        path + "/testJAR.jar",
+        
+//        path + "/testKML.kml", 
+//        path + "/testRDF.rdf", 
+        path + "/testTrueType.ttf", 
+        path + "/testVISIO.vsd",
+//        path + "/testWAR.war", 
+//        path + "/testWindows-x86-32.exe",
+//        path + "/testWINMAIL.dat", 
+//        path + "/testWMF.wmf", 
+    };   
+    testDocumentTypesInternal(files);
+  }
+
+  @Test
+  public void testAvroRoundTrip() throws Exception {
+    String file = RESOURCES_DIR + "/test-documents" + "/sample-statuses-20120906-141433.avro";
+    testDocumentTypesInternal(file);
+    QueryResponse rsp = query("*:*");
+    Iterator<SolrDocument> iter = rsp.getResults().iterator();
+    ListMultimap<String, String> expectedFieldValues;
+    expectedFieldValues = ImmutableListMultimap.of("id", "1234567890", "text", "sample tweet one",
+                                                   "user_screen_name", "fake_user1");
+    assertEquals(expectedFieldValues, next(iter));
+    expectedFieldValues = ImmutableListMultimap.of("id", "2345678901", "text", "sample tweet two",
+                                                   "user_screen_name", "fake_user2");
+    assertEquals(expectedFieldValues, next(iter));
+    assertFalse(iter.hasNext());
+  }
+  
+  private ListMultimap<String, Object> next(Iterator<SolrDocument> iter) {
+    SolrDocument doc = iter.next();
+    Record record = toRecord(doc);
+    record.removeAll("_version_"); // the values of this field are unknown and internal to solr
+    return record.getFields();    
+  }
+  
+  private Record toRecord(SolrDocument doc) {
+    Record record = new Record();
+    for (String key : doc.keySet()) {
+      record.getFields().replaceValues(key, doc.getFieldValues(key));        
+    }
+    return record;
+  }
+  
+  private void testDocumentTypesInternal(String... files) throws Exception {
+    int numDocs = 0;
+    long startTime = System.currentTimeMillis();
+    
+    assertEquals(numDocs, queryResultSetSize("*:*"));      
+//  assertQ(req("*:*"), "//*[@numFound='0']");
+    for (int i = 0; i < 1; i++) {      
+      for (String file : files) {
+        File f = new File(file);
+        byte[] body = Files.toByteArray(f);
+        Event event = EventBuilder.withBody(body);
+        event.getHeaders().put(Fields.ATTACHMENT_NAME, f.getName());
+        load(event);
+        Integer count = expectedRecords.get(file);
+        if (count != null) {
+          numDocs += count;
+        } else {
+          numDocs++;
+        }
+        assertEquals(numDocs, queryResultSetSize("*:*"));
+      }
+      LOGGER.trace("iter: {}", i);
+    }
+    LOGGER.trace("all done with put at {}", System.currentTimeMillis() - startTime);
+    assertEquals(numDocs, queryResultSetSize("*:*"));
+    LOGGER.trace("sink: {}", sink);
+  }
+
+//  @Test
+  public void benchmarkDocumentTypes() throws Exception {
+    int iters = 200;
+    
+//    LogManager.getLogger(getClass().getPackage().getName()).setLevel(Level.INFO);
+    
+    assertEquals(0, queryResultSetSize("*:*"));      
+    String path = RESOURCES_DIR + "/test-documents";
+    String[] files = new String[] {
+//        path + "/testBMPfp.txt",
+//        path + "/boilerplate.html",
+//        path + "/NullHeader.docx",
+//        path + "/testWORD_various.doc",          
+//        path + "/testPDF.pdf",
+//        path + "/testJPEG_EXIF.jpg",
+//        path + "/testXML.xml",          
+//        path + "/cars.csv",
+//        path + "/cars.csv.gz",
+//        path + "/cars.tar.gz",
+//        path + "/sample-statuses-20120906-141433.avro",
+        path + "/sample-statuses-20120906-141433-medium.avro",
+    };
+    
+    List<Event> events = new ArrayList<Event>();
+    for (String file : files) {
+      File f = new File(file);
+      byte[] body = Files.toByteArray(f);
+      Event event = EventBuilder.withBody(body);
+//      event.getHeaders().put(Metadata.RESOURCE_NAME_KEY, f.getName());
+      events.add(event);
+    }
+    
+    long startTime = System.currentTimeMillis();
+    for (int i = 0; i < iters; i++) {
+      if (i % 10000 == 0) {
+        LOGGER.info("iter: {}", i);
+      }
+      for (Event event : events) {
+        event = EventBuilder.withBody(event.getBody(), new HashMap(event.getHeaders()));
+        event.getHeaders().put("id", UUID.randomUUID().toString());
+        load(event);
+      }
+    }
+    
+    float secs = (System.currentTimeMillis() - startTime) / 1000.0f;
+    long numDocs = queryResultSetSize("*:*");
+    LOGGER.info("Took secs: " + secs + ", iters/sec: " + (iters / secs));
+    LOGGER.info("Took secs: " + secs + ", docs/sec: " + (numDocs / secs));
+    LOGGER.info("Iterations: " + iters + ", numDocs: " + numDocs);
+    LOGGER.info("sink: {}", sink);
+  }
+
+  private void load(Event event) throws EventDeliveryException {
+    source.load(event);
+  }
+
+  private void commit() throws SolrServerException, IOException {
+    solrServer.commit(false, true, true);
+  }
+  
+  private int queryResultSetSize(String query) throws SolrServerException, IOException {
+    commit();
+    QueryResponse rsp = query(query);
+    LOGGER.debug("rsp: {}", rsp);
+    int size = rsp.getResults().size();
+    return size;
+  }
+  
+  private QueryResponse query(String query) throws SolrServerException, IOException {
+    commit();
+    QueryResponse rsp = solrServer.query(new SolrQuery(query).setRows(Integer.MAX_VALUE));
+    LOGGER.debug("rsp: {}", rsp);
+    return rsp;
+  }
+  
+}
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/java/org/apache/flume/sink/solr/morphline/TestUUIDInterceptor.java b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/java/org/apache/flume/sink/solr/morphline/TestUUIDInterceptor.java
new file mode 100644
index 0000000..ceff028
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/java/org/apache/flume/sink/solr/morphline/TestUUIDInterceptor.java
@@ -0,0 +1,63 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flume.sink.solr.morphline;
+
+import org.apache.flume.Context;
+import org.apache.flume.Event;
+import org.apache.flume.event.SimpleEvent;
+import org.junit.Assert;
+import org.junit.Test;
+
+public class TestUUIDInterceptor extends Assert {
+
+  private static final String ID = "id";
+
+  @Test
+  public void testBasic() throws Exception {
+    Context context = new Context();
+    context.put(UUIDInterceptor.HEADER_NAME, ID);
+    context.put(UUIDInterceptor.PRESERVE_EXISTING_NAME, "true");
+    Event event = new SimpleEvent();
+    assertTrue(build(context).intercept(event).getHeaders().get(ID).length() > 0);
+  }
+
+  @Test
+  public void testPreserveExisting() throws Exception {
+    Context context = new Context();
+    context.put(UUIDInterceptor.HEADER_NAME, ID);
+    context.put(UUIDInterceptor.PRESERVE_EXISTING_NAME, "true");
+    Event event = new SimpleEvent();
+    event.getHeaders().put(ID, "foo");
+    assertEquals("foo", build(context).intercept(event).getHeaders().get(ID));
+  }
+
+  @Test
+  public void testPrefix() throws Exception {
+    Context context = new Context();
+    context.put(UUIDInterceptor.HEADER_NAME, ID);
+    context.put(UUIDInterceptor.PREFIX_NAME, "bar#");
+    Event event = new SimpleEvent();
+    assertTrue(build(context).intercept(event).getHeaders().get(ID).startsWith("bar#"));
+  }
+
+  private UUIDInterceptor build(Context context) {
+    UUIDInterceptor.Builder builder = new UUIDInterceptor.Builder();
+    builder.configure(context);
+    return builder.build();
+  }
+
+}
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/custom-mimetypes.xml b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/custom-mimetypes.xml
new file mode 100644
index 0000000..4ee476a
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/custom-mimetypes.xml
@@ -0,0 +1,38 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!-- Licensed to the Apache Software Foundation (ASF) under one or more contributor
+  license agreements. See the NOTICE file distributed with this work for additional
+  information regarding copyright ownership. The ASF licenses this file to
+  You under the Apache License, Version 2.0 (the "License"); you may not use
+  this file except in compliance with the License. You may obtain a copy of
+  the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required
+  by applicable law or agreed to in writing, software distributed under the
+  License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS
+  OF ANY KIND, either express or implied. See the License for the specific
+  language governing permissions and limitations under the License. -->
+
+<mime-info>
+
+  <mime-type type="text/space-separated-values">
+    <glob pattern="*.ssv"/>
+  </mime-type>
+
+  <mime-type type="avro/binary">
+    <magic priority="50">
+      <match value="0x4f626a01" type="string" offset="0"/>
+    </magic>
+    <glob pattern="*.avro"/>
+  </mime-type>
+
+  <mime-type type="mytwittertest/json+delimited+length">
+    <magic priority="50">
+      <match value="[0-9]+(\r)?\n\\{&quot;" type="regex" offset="0:16"/>
+    </magic>
+  </mime-type>
+
+  <mime-type type="application/hadoop-sequence-file">
+    <magic priority="50">
+      <match value="SEQ[\0-\6]" type="regex" offset="0"/>
+    </magic>
+  </mime-type>
+
+</mime-info>
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/log4j.properties b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/log4j.properties
new file mode 100644
index 0000000..4bfd3fc
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/log4j.properties
@@ -0,0 +1,34 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#  http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+log4j.rootLogger=WARN, A1
+
+log4j.logger.org.apache.flume.sink=INFO
+#log4j.logger.org.apache.flume.sink.solr=DEBUG
+log4j.logger.org.apache.solr=INFO
+#log4j.logger.org.apache.solr.hadoop=DEBUG
+log4j.logger.org.kitesdk.morphline=DEBUG
+log4j.logger.org.apache.solr.morphline=DEBUG
+log4j.logger.org.apache.solr.update.processor.LogUpdateProcessor=WARN
+log4j.logger.org.apache.solr.core.SolrCore=WARN
+log4j.logger.org.apache.solr.search.SolrIndexSearcher=ERROR
+
+# A1 is set to be a ConsoleAppender.
+log4j.appender.A1=org.apache.log4j.ConsoleAppender
+
+# A1 uses PatternLayout.
+log4j.appender.A1.layout=org.apache.log4j.PatternLayout
+log4j.appender.A1.layout.ConversionPattern=%-4r [%t] %-5p %c %x - %m%n
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/currency.xml b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/currency.xml
new file mode 100644
index 0000000..654de41
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/currency.xml
@@ -0,0 +1,67 @@
+<?xml version="1.0" ?>
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements.  See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License.  You may obtain a copy of the License at
+
+     http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+
+<!-- Example exchange rates file for CurrencyField type named "currency" in example schema -->
+
+<currencyConfig version="1.0">
+  <rates>
+    <!-- Updated from http://www.exchangerate.com/ at 2011-09-27 -->
+    <rate from="USD" to="ARS" rate="4.333871" comment="ARGENTINA Peso" />
+    <rate from="USD" to="AUD" rate="1.025768" comment="AUSTRALIA Dollar" />
+    <rate from="USD" to="EUR" rate="0.743676" comment="European Euro" />
+    <rate from="USD" to="BRL" rate="1.881093" comment="BRAZIL Real" />
+    <rate from="USD" to="CAD" rate="1.030815" comment="CANADA Dollar" />
+    <rate from="USD" to="CLP" rate="519.0996" comment="CHILE Peso" />
+    <rate from="USD" to="CNY" rate="6.387310" comment="CHINA Yuan" />
+    <rate from="USD" to="CZK" rate="18.47134" comment="CZECH REP. Koruna" />
+    <rate from="USD" to="DKK" rate="5.515436" comment="DENMARK Krone" />
+    <rate from="USD" to="HKD" rate="7.801922" comment="HONG KONG Dollar" />
+    <rate from="USD" to="HUF" rate="215.6169" comment="HUNGARY Forint" />
+    <rate from="USD" to="ISK" rate="118.1280" comment="ICELAND Krona" />
+    <rate from="USD" to="INR" rate="49.49088" comment="INDIA Rupee" />
+    <rate from="USD" to="XDR" rate="0.641358" comment="INTNL MON. FUND SDR" />
+    <rate from="USD" to="ILS" rate="3.709739" comment="ISRAEL Sheqel" />
+    <rate from="USD" to="JPY" rate="76.32419" comment="JAPAN Yen" />
+    <rate from="USD" to="KRW" rate="1169.173" comment="KOREA (SOUTH) Won" />
+    <rate from="USD" to="KWD" rate="0.275142" comment="KUWAIT Dinar" />
+    <rate from="USD" to="MXN" rate="13.85895" comment="MEXICO Peso" />
+    <rate from="USD" to="NZD" rate="1.285159" comment="NEW ZEALAND Dollar" />
+    <rate from="USD" to="NOK" rate="5.859035" comment="NORWAY Krone" />
+    <rate from="USD" to="PKR" rate="87.57007" comment="PAKISTAN Rupee" />
+    <rate from="USD" to="PEN" rate="2.730683" comment="PERU Sol" />
+    <rate from="USD" to="PHP" rate="43.62039" comment="PHILIPPINES Peso" />
+    <rate from="USD" to="PLN" rate="3.310139" comment="POLAND Zloty" />
+    <rate from="USD" to="RON" rate="3.100932" comment="ROMANIA Leu" />
+    <rate from="USD" to="RUB" rate="32.14663" comment="RUSSIA Ruble" />
+    <rate from="USD" to="SAR" rate="3.750465" comment="SAUDI ARABIA Riyal" />
+    <rate from="USD" to="SGD" rate="1.299352" comment="SINGAPORE Dollar" />
+    <rate from="USD" to="ZAR" rate="8.329761" comment="SOUTH AFRICA Rand" />
+    <rate from="USD" to="SEK" rate="6.883442" comment="SWEDEN Krona" />
+    <rate from="USD" to="CHF" rate="0.906035" comment="SWITZERLAND Franc" />
+    <rate from="USD" to="TWD" rate="30.40283" comment="TAIWAN Dollar" />
+    <rate from="USD" to="THB" rate="30.89487" comment="THAILAND Baht" />
+    <rate from="USD" to="AED" rate="3.672955" comment="U.A.E. Dirham" />
+    <rate from="USD" to="UAH" rate="7.988582" comment="UKRAINE Hryvnia" />
+    <rate from="USD" to="GBP" rate="0.647910" comment="UNITED KINGDOM Pound" />
+
+    <!-- Cross-rates for some common currencies -->
+    <rate from="EUR" to="GBP" rate="0.869914" />
+    <rate from="EUR" to="NOK" rate="7.800095" />
+    <rate from="GBP" to="NOK" rate="8.966508" />
+  </rates>
+</currencyConfig>
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/elevate.xml b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/elevate.xml
new file mode 100644
index 0000000..8f3aa80
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/elevate.xml
@@ -0,0 +1,38 @@
+<?xml version="1.0" encoding="UTF-8" ?>
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements.  See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License.  You may obtain a copy of the License at
+
+     http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+
+<!-- If this file is found in the config directory, it will only be
+     loaded once at startup.  If it is found in Solr's data
+     directory, it will be re-loaded every commit.
+
+   See http://wiki.apache.org/solr/QueryElevationComponent for more info
+
+-->
+<elevate>
+ <query text="foo bar">
+  <doc id="1" />
+  <doc id="2" />
+  <doc id="3" />
+ </query>
+
+ <query text="ipod">
+   <doc id="MA147LL/A" />  <!-- put the actual ipod at the top -->
+   <doc id="IW-02" exclude="true" /> <!-- exclude this cable -->
+ </query>
+
+</elevate>
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/contractions_ca.txt b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/contractions_ca.txt
new file mode 100644
index 0000000..307a85f
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/contractions_ca.txt
@@ -0,0 +1,8 @@
+# Set of Catalan contractions for ElisionFilter
+# TODO: load this as a resource from the analyzer and sync it in build.xml
+d
+l
+m
+n
+s
+t
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/contractions_fr.txt b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/contractions_fr.txt
new file mode 100644
index 0000000..722db58
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/contractions_fr.txt
@@ -0,0 +1,9 @@
+# Set of French contractions for ElisionFilter
+# TODO: load this as a resource from the analyzer and sync it in build.xml
+l
+m
+t
+qu
+n
+s
+j
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/contractions_ga.txt b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/contractions_ga.txt
new file mode 100644
index 0000000..9ebe7fa
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/contractions_ga.txt
@@ -0,0 +1,5 @@
+# Set of Irish contractions for ElisionFilter
+# TODO: load this as a resource from the analyzer and sync it in build.xml
+d
+m
+b
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/contractions_it.txt b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/contractions_it.txt
new file mode 100644
index 0000000..cac0409
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/contractions_it.txt
@@ -0,0 +1,23 @@
+# Set of Italian contractions for ElisionFilter
+# TODO: load this as a resource from the analyzer and sync it in build.xml
+c
+l 
+all 
+dall 
+dell 
+nell 
+sull 
+coll 
+pell 
+gl 
+agl 
+dagl 
+degl 
+negl 
+sugl 
+un 
+m 
+t 
+s 
+v 
+d
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/hyphenations_ga.txt b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/hyphenations_ga.txt
new file mode 100644
index 0000000..4d2642c
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/hyphenations_ga.txt
@@ -0,0 +1,5 @@
+# Set of Irish hyphenations for StopFilter
+# TODO: load this as a resource from the analyzer and sync it in build.xml
+h
+n
+t
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stemdict_nl.txt b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stemdict_nl.txt
new file mode 100644
index 0000000..4410729
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stemdict_nl.txt
@@ -0,0 +1,6 @@
+# Set of overrides for the Dutch stemmer
+# TODO: load this as a resource from the analyzer and sync it in build.xml
+fiets	fiets
+bromfiets	bromfiets
+ei	eier
+kind	kinder
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stoptags_ja.txt b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stoptags_ja.txt
new file mode 100644
index 0000000..71b7508
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stoptags_ja.txt
@@ -0,0 +1,420 @@
+#
+# This file defines a Japanese stoptag set for JapanesePartOfSpeechStopFilter.
+#
+# Any token whose part-of-speech tag exactly matches one defined in this
+# file is removed from the token stream.
+#
+# Set your own stoptags by uncommenting the lines below.  Note that comments are
+# not allowed on the same line as a stoptag.  See LUCENE-3745 for frequency lists,
+# etc. that can be useful for building your own stoptag set.
+#
+# The entire possible tagset is provided below for convenience.
+#
+#####
+#  noun: unclassified nouns
+#名詞
+#
+#  noun-common: Common nouns or nouns where the sub-classification is undefined
+#名詞-一般
+#
+#  noun-proper: Proper nouns where the sub-classification is undefined 
+#名詞-固有名詞
+#
+#  noun-proper-misc: miscellaneous proper nouns
+#名詞-固有名詞-一般
+#
+#  noun-proper-person: Personal names where the sub-classification is undefined
+#名詞-固有名詞-人名
+#
+#  noun-proper-person-misc: names that cannot be divided into surname and 
+#  given name; foreign names; names where the surname or given name is unknown.
+#  e.g. お市の方
+#名詞-固有名詞-人名-一般
+#
+#  noun-proper-person-surname: Mainly Japanese surnames.
+#  e.g. 山田
+#名詞-固有名詞-人名-姓
+#
+#  noun-proper-person-given_name: Mainly Japanese given names.
+#  e.g. 太郎
+#名詞-固有名詞-人名-名
+#
+#  noun-proper-organization: Names representing organizations.
+#  e.g. 通産省, NHK
+#名詞-固有名詞-組織
+#
+#  noun-proper-place: Place names where the sub-classification is undefined
+#名詞-固有名詞-地域
+#
+#  noun-proper-place-misc: Place names excluding countries.
+#  e.g. アジア, バルセロナ, 京都
+#名詞-固有名詞-地域-一般
+#
+#  noun-proper-place-country: Country names. 
+#  e.g. 日本, オーストラリア
+#名詞-固有名詞-地域-国
+#
+#  noun-pronoun: Pronouns where the sub-classification is undefined
+#名詞-代名詞
+#
+#  noun-pronoun-misc: miscellaneous pronouns: 
+#  e.g. それ, ここ, あいつ, あなた, あちこち, いくつ, どこか, なに, みなさん, みんな, わたくし, われわれ
+#名詞-代名詞-一般
+#
+#  noun-pronoun-contraction: Spoken language contraction made by combining a 
+#  pronoun and the particle 'wa'.
+#  e.g. ありゃ, こりゃ, こりゃあ, そりゃ, そりゃあ 
+#名詞-代名詞-縮約
+#
+#  noun-adverbial: Temporal nouns such as names of days or months that behave 
+#  like adverbs. Nouns that represent amount or ratios and can be used adverbially,
+#  e.g. 金曜, 一月, 午後, 少量
+#名詞-副詞可能
+#
+#  noun-verbal: Nouns that take arguments with case and can appear followed by 
+#  'suru' and related verbs (する, できる, なさる, くださる)
+#  e.g. インプット, 愛着, 悪化, 悪戦苦闘, 一安心, 下取り
+#名詞-サ変接続
+#
+#  noun-adjective-base: The base form of adjectives, words that appear before な ("na")
+#  e.g. 健康, 安易, 駄目, だめ
+#名詞-形容動詞語幹
+#
+#  noun-numeric: Arabic numbers, Chinese numerals, and counters like 何 (回), 数.
+#  e.g. 0, 1, 2, 何, 数, 幾
+#名詞-数
+#
+#  noun-affix: noun affixes where the sub-classification is undefined
+#名詞-非自立
+#
+#  noun-affix-misc: Of adnominalizers, the case-marker の ("no"), and words that 
+#  attach to the base form of inflectional words, words that cannot be classified 
+#  into any of the other categories below. This category includes indefinite nouns.
+#  e.g. あかつき, 暁, かい, 甲斐, 気, きらい, 嫌い, くせ, 癖, こと, 事, ごと, 毎, しだい, 次第, 
+#       順, せい, 所為, ついで, 序で, つもり, 積もり, 点, どころ, の, はず, 筈, はずみ, 弾み, 
+#       拍子, ふう, ふり, 振り, ほう, 方, 旨, もの, 物, 者, ゆえ, 故, ゆえん, 所以, わけ, 訳,
+#       わり, 割り, 割, ん-口語/, もん-口語/
+#名詞-非自立-一般
+#
+#  noun-affix-adverbial: noun affixes that can behave as adverbs.
+#  e.g. あいだ, 間, あげく, 挙げ句, あと, 後, 余り, 以外, 以降, 以後, 以上, 以前, 一方, うえ, 
+#       上, うち, 内, おり, 折り, かぎり, 限り, きり, っきり, 結果, ころ, 頃, さい, 際, 最中, さなか, 
+#       最中, じたい, 自体, たび, 度, ため, 為, つど, 都度, とおり, 通り, とき, 時, ところ, 所, 
+#       とたん, 途端, なか, 中, のち, 後, ばあい, 場合, 日, ぶん, 分, ほか, 他, まえ, 前, まま, 
+#       儘, 侭, みぎり, 矢先
+#名詞-非自立-副詞可能
+#
+#  noun-affix-aux: noun affixes treated as 助動詞 ("auxiliary verb") in school grammars 
+#  with the stem よう(だ) ("you(da)").
+#  e.g.  よう, やう, 様 (よう)
+#名詞-非自立-助動詞語幹
+#  
+#  noun-affix-adjective-base: noun affixes that can connect to the indeclinable
+#  connection form な (aux "da").
+#  e.g. みたい, ふう
+#名詞-非自立-形容動詞語幹
+#
+#  noun-special: special nouns where the sub-classification is undefined.
+#名詞-特殊
+#
+#  noun-special-aux: The そうだ ("souda") stem form that is used for reporting news, is 
+#  treated as 助動詞 ("auxiliary verb") in school grammars, and attaches to the base 
+#  form of inflectional words.
+#  e.g. そう
+#名詞-特殊-助動詞語幹
+#
+#  noun-suffix: noun suffixes where the sub-classification is undefined.
+#名詞-接尾
+#
+#  noun-suffix-misc: Of the nouns or stem forms of other parts of speech that connect 
+#  to ガル or タイ and can combine into compound nouns, words that cannot be classified into
+#  any of the other categories below. In general, this category is more inclusive than 
+#  接尾語 ("suffix") and is usually the last element in a compound noun.
+#  e.g. おき, かた, 方, 甲斐 (がい), がかり, ぎみ, 気味, ぐるみ, (～した) さ, 次第, 済 (ず) み,
+#       よう, (でき)っこ, 感, 観, 性, 学, 類, 面, 用
+#名詞-接尾-一般
+#
+#  noun-suffix-person: Suffixes that form nouns and attach to person names more often
+#  than other nouns.
+#  e.g. 君, 様, 著
+#名詞-接尾-人名
+#
+#  noun-suffix-place: Suffixes that form nouns and attach to place names more often 
+#  than other nouns.
+#  e.g. 町, 市, 県
+#名詞-接尾-地域
+#
+#  noun-suffix-verbal: Of the suffixes that attach to nouns and form nouns, those that 
+#  can appear before スル ("suru").
+#  e.g. 化, 視, 分け, 入り, 落ち, 買い
+#名詞-接尾-サ変接続
+#
+#  noun-suffix-aux: The stem form of そうだ (様態) that is used to indicate conditions, 
+#  is treated as 助動詞 ("auxiliary verb") in school grammars, and attaches to the 
+#  conjunctive form of inflectional words.
+#  e.g. そう
+#名詞-接尾-助動詞語幹
+#
+#  noun-suffix-adjective-base: Suffixes that attach to other nouns or the conjunctive 
+#  form of inflectional words and appear before the copula だ ("da").
+#  e.g. 的, げ, がち
+#名詞-接尾-形容動詞語幹
+#
+#  noun-suffix-adverbial: Suffixes that attach to other nouns and can behave as adverbs.
+#  e.g. 後 (ご), 以後, 以降, 以前, 前後, 中, 末, 上, 時 (じ)
+#名詞-接尾-副詞可能
+#
+#  noun-suffix-classifier: Suffixes that attach to numbers and form nouns. This category 
+#  is more inclusive than 助数詞 ("classifier") and includes common nouns that attach 
+#  to numbers.
+#  e.g. 個, つ, 本, 冊, パーセント, cm, kg, カ月, か国, 区画, 時間, 時半
+#名詞-接尾-助数詞
+#
+#  noun-suffix-special: Special suffixes that mainly attach to inflecting words.
+#  e.g. (楽し) さ, (考え) 方
+#名詞-接尾-特殊
+#
+#  noun-suffix-conjunctive: Nouns that behave like conjunctions and join two words 
+#  together.
+#  e.g. (日本) 対 (アメリカ), 対 (アメリカ), (3) 対 (5), (女優) 兼 (主婦)
+#名詞-接続詞的
+#
+#  noun-verbal_aux: Nouns that attach to the conjunctive particle て ("te") and are 
+#  semantically verb-like.
+#  e.g. ごらん, ご覧, 御覧, 頂戴
+#名詞-動詞非自立的
+#
+#  noun-quotation: text that cannot be segmented into words, proverbs, Chinese poetry, 
+#  dialects, English, etc. Currently, the only entry for 名詞 引用文字列 ("noun quotation") 
+#  is いわく ("iwaku").
+#名詞-引用文字列
+#
+#  noun-nai_adjective: Words that appear before the auxiliary verb ない ("nai") and
+#  behave like an adjective.
+#  e.g. 申し訳, 仕方, とんでも, 違い
+#名詞-ナイ形容詞語幹
+#
+#####
+#  prefix: unclassified prefixes
+#接頭詞
+#
+#  prefix-nominal: Prefixes that attach to nouns (including adjective stem forms) 
+#  excluding numerical expressions.
+#  e.g. お (水), 某 (氏), 同 (社), 故 (～氏), 高 (品質), お (見事), ご (立派)
+#接頭詞-名詞接続
+#
+#  prefix-verbal: Prefixes that attach to the imperative form of a verb or a verb
+#  in conjunctive form followed by なる/なさる/くださる.
+#  e.g. お (読みなさい), お (座り)
+#接頭詞-動詞接続
+#
+#  prefix-adjectival: Prefixes that attach to adjectives.
+#  e.g. お (寒いですねえ), バカ (でかい)
+#接頭詞-形容詞接続
+#
+#  prefix-numerical: Prefixes that attach to numerical expressions.
+#  e.g. 約, およそ, 毎時
+#接頭詞-数接続
+#
+#####
+#  verb: unclassified verbs
+#動詞
+#
+#  verb-main:
+#動詞-自立
+#
+#  verb-auxiliary:
+#動詞-非自立
+#
+#  verb-suffix:
+#動詞-接尾
+#
+#####
+#  adjective: unclassified adjectives
+#形容詞
+#
+#  adjective-main:
+#形容詞-自立
+#
+#  adjective-auxiliary:
+#形容詞-非自立
+#
+#  adjective-suffix:
+#形容詞-接尾
+#
+#####
+#  adverb: unclassified adverbs
+#副詞
+#
+#  adverb-misc: Words that can be segmented into one unit and where adnominal 
+#  modification is not possible.
+#  e.g. あいかわらず, 多分
+#副詞-一般
+#
+#  adverb-particle_conjunction: Adverbs that can be followed by の, は, に, 
+#  な, する, だ, etc.
+#  e.g. こんなに, そんなに, あんなに, なにか, なんでも
+#副詞-助詞類接続
+#
+#####
+#  adnominal: Words that only have noun-modifying forms.
+#  e.g. この, その, あの, どの, いわゆる, なんらかの, 何らかの, いろんな, こういう, そういう, ああいう, 
+#       どういう, こんな, そんな, あんな, どんな, 大きな, 小さな, おかしな, ほんの, たいした, 
+#       「(, も) さる (ことながら)」, 微々たる, 堂々たる, 単なる, いかなる, 我が」「同じ, 亡き
+#連体詞
+#
+#####
+#  conjunction: Conjunctions that can occur independently.
+#  e.g. が, けれども, そして, じゃあ, それどころか
+接続詞
+#
+#####
+#  particle: unclassified particles.
+助詞
+#
+#  particle-case: case particles where the subclassification is undefined.
+助詞-格助詞
+#
+#  particle-case-misc: Case particles.
+#  e.g. から, が, で, と, に, へ, より, を, の, にて
+助詞-格助詞-一般
+#
+#  particle-case-quote: the "to" that appears after nouns, a person’s speech, 
+#  quotation marks, expressions of decisions from a meeting, reasons, judgements,
+#  conjectures, etc.
+#  e.g. ( だ) と (述べた.), ( である) と (して執行猶予...)
+助詞-格助詞-引用
+#
+#  particle-case-compound: Compounds of particles and verbs that mainly behave 
+#  like case particles.
+#  e.g. という, といった, とかいう, として, とともに, と共に, でもって, にあたって, に当たって, に当って,
+#       にあたり, に当たり, に当り, に当たる, にあたる, において, に於いて,に於て, における, に於ける, 
+#       にかけ, にかけて, にかんし, に関し, にかんして, に関して, にかんする, に関する, に際し, 
+#       に際して, にしたがい, に従い, に従う, にしたがって, に従って, にたいし, に対し, にたいして, 
+#       に対して, にたいする, に対する, について, につき, につけ, につけて, につれ, につれて, にとって,
+#       にとり, にまつわる, によって, に依って, に因って, により, に依り, に因り, による, に依る, に因る, 
+#       にわたって, にわたる, をもって, を以って, を通じ, を通じて, を通して, をめぐって, をめぐり, をめぐる,
+#       って-口語/, ちゅう-関西弁「という」/, (何) ていう (人)-口語/, っていう-口語/, といふ, とかいふ
+助詞-格助詞-連語
+#
+#  particle-conjunctive:
+#  e.g. から, からには, が, けれど, けれども, けど, し, つつ, て, で, と, ところが, どころか, とも, ども, 
+#       ながら, なり, ので, のに, ば, ものの, や ( した), やいなや, (ころん) じゃ(いけない)-口語/, 
+#       (行っ) ちゃ(いけない)-口語/, (言っ) たって (しかたがない)-口語/, (それがなく)ったって (平気)-口語/
+助詞-接続助詞
+#
+#  particle-dependency:
+#  e.g. こそ, さえ, しか, すら, は, も, ぞ
+助詞-係助詞
+#
+#  particle-adverbial:
+#  e.g. がてら, かも, くらい, 位, ぐらい, しも, (学校) じゃ(これが流行っている)-口語/, 
+#       (それ)じゃあ (よくない)-口語/, ずつ, (私) なぞ, など, (私) なり (に), (先生) なんか (大嫌い)-口語/,
+#       (私) なんぞ, (先生) なんて (大嫌い)-口語/, のみ, だけ, (私) だって-口語/, だに, 
+#       (彼)ったら-口語/, (お茶) でも (いかが), 等 (とう), (今後) とも, ばかり, ばっか-口語/, ばっかり-口語/,
+#       ほど, 程, まで, 迄, (誰) も (が)([助詞-格助詞] および [助詞-係助詞] の前に位置する「も」)
+助詞-副助詞
+#
+#  particle-interjective: particles with interjective grammatical roles.
+#  e.g. (松島) や
+助詞-間投助詞
+#
+#  particle-coordinate:
+#  e.g. と, たり, だの, だり, とか, なり, や, やら
+助詞-並立助詞
+#
+#  particle-final:
+#  e.g. かい, かしら, さ, ぜ, (だ)っけ-口語/, (とまってる) で-方言/, な, ナ, なあ-口語/, ぞ, ね, ネ, 
+#       ねぇ-口語/, ねえ-口語/, ねん-方言/, の, のう-口語/, や, よ, ヨ, よぉ-口語/, わ, わい-口語/
+助詞-終助詞
+#
+#  particle-adverbial/conjunctive/final: The particle "ka" when unknown whether it is 
+#  adverbial, conjunctive, or sentence final. For example:
+#       (a) 「A か B か」. Ex:「(国内で運用する) か,(海外で運用する) か (.)」
+#       (b) Inside an adverb phrase. Ex:「(幸いという) か (, 死者はいなかった.)」
+#           「(祈りが届いたせい) か (, 試験に合格した.)」
+#       (c) 「かのように」. Ex:「(何もなかった) か (のように振る舞った.)」
+#  e.g. か
+助詞-副助詞／並立助詞／終助詞
+#
+#  particle-adnominalizer: The "no" that attaches to nouns and modifies 
+#  non-inflectional words.
+助詞-連体化
+#
+#  particle-adverbializer: The "ni" and "to" that appear following nouns and adverbs 
+#  that are giongo, giseigo, or gitaigo.
+#  e.g. に, と
+助詞-副詞化
+#
+#  particle-special: A particle that does not fit into one of the above classifications. 
+#  This includes particles that are used in Tanka, Haiku, and other poetry.
+#  e.g. かな, けむ, ( しただろう) に, (あんた) にゃ(わからん), (俺) ん (家)
+助詞-特殊
+#
+#####
+#  auxiliary-verb:
+助動詞
+#
+#####
+#  interjection: Greetings and other exclamations.
+#  e.g. おはよう, おはようございます, こんにちは, こんばんは, ありがとう, どうもありがとう, ありがとうございます, 
+#       いただきます, ごちそうさま, さよなら, さようなら, はい, いいえ, ごめん, ごめんなさい
+#感動詞
+#
+#####
+#  symbol: unclassified Symbols.
+記号
+#
+#  symbol-misc: A general symbol not in one of the categories below.
+#  e.g. [○◎@$〒→+]
+記号-一般
+#
+#  symbol-comma: Commas
+#  e.g. [,、]
+記号-読点
+#
+#  symbol-period: Periods and full stops.
+#  e.g. [.．。]
+記号-句点
+#
+#  symbol-space: Full-width whitespace.
+記号-空白
+#
+#  symbol-open_bracket:
+#  e.g. [({‘“『【]
+記号-括弧開
+#
+#  symbol-close_bracket:
+#  e.g. [)}’”』」】]
+記号-括弧閉
+#
+#  symbol-alphabetic:
+#記号-アルファベット
+#
+#####
+#  other: unclassified other
+#その他
+#
+#  other-interjection: Words that are hard to classify as noun-suffixes or 
+#  sentence-final particles.
+#  e.g. (だ)ァ
+その他-間投
+#
+#####
+#  filler: Aizuchi that occurs during a conversation or sounds inserted as filler.
+#  e.g. あの, うんと, えと
+フィラー
+#
+#####
+#  non-verbal: non-verbal sound.
+非言語音
+#
+#####
+#  fragment:
+#語断片
+#
+#####
+#  unknown: unknown part of speech.
+#未知語
+#
+##### End of file
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_ar.txt b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_ar.txt
new file mode 100644
index 0000000..046829d
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_ar.txt
@@ -0,0 +1,125 @@
+# This file was created by Jacques Savoy and is distributed under the BSD license.
+# See http://members.unine.ch/jacques.savoy/clef/index.html.
+# Also see http://www.opensource.org/licenses/bsd-license.html
+# Cleaned on October 11, 2009 (not normalized, so use before normalization)
+# This means that when modifying this list, you might need to add some 
+# redundant entries, for example containing forms with both أ and ا
+من
+ومن
+منها
+منه
+في
+وفي
+فيها
+فيه
+و
+ف
+ثم
+او
+أو
+ب
+بها
+به
+ا
+أ
+اى
+اي
+أي
+أى
+لا
+ولا
+الا
+ألا
+إلا
+لكن
+ما
+وما
+كما
+فما
+عن
+مع
+اذا
+إذا
+ان
+أن
+إن
+انها
+أنها
+إنها
+انه
+أنه
+إنه
+بان
+بأن
+فان
+فأن
+وان
+وأن
+وإن
+التى
+التي
+الذى
+الذي
+الذين
+الى
+الي
+إلى
+إلي
+على
+عليها
+عليه
+اما
+أما
+إما
+ايضا
+أيضا
+كل
+وكل
+لم
+ولم
+لن
+ولن
+هى
+هي
+هو
+وهى
+وهي
+وهو
+فهى
+فهي
+فهو
+انت
+أنت
+لك
+لها
+له
+هذه
+هذا
+تلك
+ذلك
+هناك
+كانت
+كان
+يكون
+تكون
+وكانت
+وكان
+غير
+بعض
+قد
+نحو
+بين
+بينما
+منذ
+ضمن
+حيث
+الان
+الآن
+خلال
+بعد
+قبل
+حتى
+عند
+عندما
+لدى
+جميع
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_bg.txt b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_bg.txt
new file mode 100644
index 0000000..1ae4ba2
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_bg.txt
@@ -0,0 +1,193 @@
+# This file was created by Jacques Savoy and is distributed under the BSD license.
+# See http://members.unine.ch/jacques.savoy/clef/index.html.
+# Also see http://www.opensource.org/licenses/bsd-license.html
+а
+аз
+ако
+ала
+бе
+без
+беше
+би
+бил
+била
+били
+било
+близо
+бъдат
+бъде
+бяха
+в
+вас
+ваш
+ваша
+вероятно
+вече
+взема
+ви
+вие
+винаги
+все
+всеки
+всички
+всичко
+всяка
+във
+въпреки
+върху
+г
+ги
+главно
+го
+д
+да
+дали
+до
+докато
+докога
+дори
+досега
+доста
+е
+едва
+един
+ето
+за
+зад
+заедно
+заради
+засега
+затова
+защо
+защото
+и
+из
+или
+им
+има
+имат
+иска
+й
+каза
+как
+каква
+какво
+както
+какъв
+като
+кога
+когато
+което
+които
+кой
+който
+колко
+която
+къде
+където
+към
+ли
+м
+ме
+между
+мен
+ми
+мнозина
+мога
+могат
+може
+моля
+момента
+му
+н
+на
+над
+назад
+най
+направи
+напред
+например
+нас
+не
+него
+нея
+ни
+ние
+никой
+нито
+но
+някои
+някой
+няма
+обаче
+около
+освен
+особено
+от
+отгоре
+отново
+още
+пак
+по
+повече
+повечето
+под
+поне
+поради
+после
+почти
+прави
+пред
+преди
+през
+при
+пък
+първо
+с
+са
+само
+се
+сега
+си
+скоро
+след
+сме
+според
+сред
+срещу
+сте
+съм
+със
+също
+т
+тази
+така
+такива
+такъв
+там
+твой
+те
+тези
+ти
+тн
+то
+това
+тогава
+този
+той
+толкова
+точно
+трябва
+тук
+тъй
+тя
+тях
+у
+харесва
+ч
+че
+често
+чрез
+ще
+щом
+я
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_ca.txt b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_ca.txt
new file mode 100644
index 0000000..3da65de
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_ca.txt
@@ -0,0 +1,220 @@
+# Catalan stopwords from http://github.com/vcl/cue.language (Apache 2 Licensed)
+a
+abans
+ací
+ah
+així
+això
+al
+als
+aleshores
+algun
+alguna
+algunes
+alguns
+alhora
+allà
+allí
+allò
+altra
+altre
+altres
+amb
+ambdós
+ambdues
+apa
+aquell
+aquella
+aquelles
+aquells
+aquest
+aquesta
+aquestes
+aquests
+aquí
+baix
+cada
+cadascú
+cadascuna
+cadascunes
+cadascuns
+com
+contra
+d'un
+d'una
+d'unes
+d'uns
+dalt
+de
+del
+dels
+des
+després
+dins
+dintre
+donat
+doncs
+durant
+e
+eh
+el
+els
+em
+en
+encara
+ens
+entre
+érem
+eren
+éreu
+es
+és
+esta
+està
+estàvem
+estaven
+estàveu
+esteu
+et
+etc
+ets
+fins
+fora
+gairebé
+ha
+han
+has
+havia
+he
+hem
+heu
+hi 
+ho
+i
+igual
+iguals
+ja
+l'hi
+la
+les
+li
+li'n
+llavors
+m'he
+ma
+mal
+malgrat
+mateix
+mateixa
+mateixes
+mateixos
+me
+mentre
+més
+meu
+meus
+meva
+meves
+molt
+molta
+moltes
+molts
+mon
+mons
+n'he
+n'hi
+ne
+ni
+no
+nogensmenys
+només
+nosaltres
+nostra
+nostre
+nostres
+o
+oh
+oi
+on
+pas
+pel
+pels
+per
+però
+perquè
+poc 
+poca
+pocs
+poques
+potser
+propi
+qual
+quals
+quan
+quant 
+que
+què
+quelcom
+qui
+quin
+quina
+quines
+quins
+s'ha
+s'han
+sa
+semblant
+semblants
+ses
+seu 
+seus
+seva
+seva
+seves
+si
+sobre
+sobretot
+sóc
+solament
+sols
+son 
+són
+sons 
+sota
+sou
+t'ha
+t'han
+t'he
+ta
+tal
+també
+tampoc
+tan
+tant
+tanta
+tantes
+teu
+teus
+teva
+teves
+ton
+tons
+tot
+tota
+totes
+tots
+un
+una
+unes
+uns
+us
+va
+vaig
+vam
+van
+vas
+veu
+vosaltres
+vostra
+vostre
+vostres
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_cz.txt b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_cz.txt
new file mode 100644
index 0000000..53c6097
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_cz.txt
@@ -0,0 +1,172 @@
+a
+s
+k
+o
+i
+u
+v
+z
+dnes
+cz
+tímto
+budeš
+budem
+byli
+jseš
+můj
+svým
+ta
+tomto
+tohle
+tuto
+tyto
+jej
+zda
+proč
+máte
+tato
+kam
+tohoto
+kdo
+kteří
+mi
+nám
+tom
+tomuto
+mít
+nic
+proto
+kterou
+byla
+toho
+protože
+asi
+ho
+naši
+napište
+re
+což
+tím
+takže
+svých
+její
+svými
+jste
+aj
+tu
+tedy
+teto
+bylo
+kde
+ke
+pravé
+ji
+nad
+nejsou
+či
+pod
+téma
+mezi
+přes
+ty
+pak
+vám
+ani
+když
+však
+neg
+jsem
+tento
+článku
+články
+aby
+jsme
+před
+pta
+jejich
+byl
+ještě
+až
+bez
+také
+pouze
+první
+vaše
+která
+nás
+nový
+tipy
+pokud
+může
+strana
+jeho
+své
+jiné
+zprávy
+nové
+není
+vás
+jen
+podle
+zde
+už
+být
+více
+bude
+již
+než
+který
+by
+které
+co
+nebo
+ten
+tak
+má
+při
+od
+po
+jsou
+jak
+další
+ale
+si
+se
+ve
+to
+jako
+za
+zpět
+ze
+do
+pro
+je
+na
+atd
+atp
+jakmile
+přičemž
+já
+on
+ona
+ono
+oni
+ony
+my
+vy
+jí
+ji
+mě
+mne
+jemu
+tomu
+těm
+těmu
+němu
+němuž
+jehož
+jíž
+jelikož
+jež
+jakož
+načež
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_da.txt b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_da.txt
new file mode 100644
index 0000000..a3ff5fe
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_da.txt
@@ -0,0 +1,108 @@
+ | From svn.tartarus.org/snowball/trunk/website/algorithms/danish/stop.txt
+ | This file is distributed under the BSD License.
+ | See http://snowball.tartarus.org/license.php
+ | Also see http://www.opensource.org/licenses/bsd-license.html
+ |  - Encoding was converted to UTF-8.
+ |  - This notice was added.
+
+ | A Danish stop word list. Comments begin with vertical bar. Each stop
+ | word is at the start of a line.
+
+ | This is a ranked list (commonest to rarest) of stopwords derived from
+ | a large text sample.
+
+
+og           | and
+i            | in
+jeg          | I
+det          | that (dem. pronoun)/it (pers. pronoun)
+at           | that (in front of a sentence)/to (with infinitive)
+en           | a/an
+den          | it (pers. pronoun)/that (dem. pronoun)
+til          | to/at/for/until/against/by/of/into, more
+er           | present tense of "to be"
+som          | who, as
+på           | on/upon/in/on/at/to/after/of/with/for, on
+de           | they
+med          | with/by/in, along
+han          | he
+af           | of/by/from/off/for/in/with/on, off
+for          | at/for/to/from/by/of/ago, in front/before, because
+ikke         | not
+der          | who/which, there/those
+var          | past tense of "to be"
+mig          | me/myself
+sig          | oneself/himself/herself/itself/themselves
+men          | but
+et           | a/an/one, one (number), someone/somebody/one
+har          | present tense of "to have"
+om           | round/about/for/in/a, about/around/down, if
+vi           | we
+min          | my
+havde        | past tense of "to have"
+ham          | him
+hun          | she
+nu           | now
+over         | over/above/across/by/beyond/past/on/about, over/past
+da           | then, when/as/since
+fra          | from/off/since, off, since
+du           | you
+ud           | out
+sin          | his/her/its/one's
+dem          | them
+os           | us/ourselves
+op           | up
+man          | you/one
+hans         | his
+hvor         | where
+eller        | or
+hvad         | what
+skal         | must/shall etc.
+selv         | myself/yourself/herself/ourselves etc., even
+her          | here
+alle         | all/everyone/everybody etc.
+vil          | will (verb)
+blev         | past tense of "to stay/to remain/to get/to become"
+kunne        | could
+ind          | in
+når          | when
+være         | present tense of "to be"
+dog          | however/yet/after all
+noget        | something
+ville        | would
+jo           | you know/you see (adv), yes
+deres        | their/theirs
+efter        | after/behind/according to/for/by/from, later/afterwards
+ned          | down
+skulle       | should
+denne        | this
+end          | than
+dette        | this
+mit          | my/mine
+også         | also
+under        | under/beneath/below/during, below/underneath
+have         | have
+dig          | you
+anden        | other
+hende        | her
+mine         | my
+alt          | everything
+meget        | much/very, plenty of
+sit          | his, her, its, one's
+sine         | his, her, its, one's
+vor          | our
+mod          | against
+disse        | these
+hvis         | if
+din          | your/yours
+nogle        | some
+hos          | by/at
+blive        | be/become
+mange        | many
+ad           | by/through
+bliver       | present tense of "to be/to become"
+hendes       | her/hers
+været        | past participle of "to be"
+thi          | for (conj)
+jer          | you
+sådan        | such, like this/like that
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_de.txt b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_de.txt
new file mode 100644
index 0000000..f770384
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_de.txt
@@ -0,0 +1,292 @@
+ | From svn.tartarus.org/snowball/trunk/website/algorithms/german/stop.txt
+ | This file is distributed under the BSD License.
+ | See http://snowball.tartarus.org/license.php
+ | Also see http://www.opensource.org/licenses/bsd-license.html
+ |  - Encoding was converted to UTF-8.
+ |  - This notice was added.
+
+ | A German stop word list. Comments begin with vertical bar. Each stop
+ | word is at the start of a line.
+
+ | The number of forms in this list is reduced significantly by passing it
+ | through the German stemmer.
+
+
+aber           |  but
+
+alle           |  all
+allem
+allen
+aller
+alles
+
+als            |  than, as
+also           |  so
+am             |  an + dem
+an             |  at
+
+ander          |  other
+andere
+anderem
+anderen
+anderer
+anderes
+anderm
+andern
+anderr
+anders
+
+auch           |  also
+auf            |  on
+aus            |  out of
+bei            |  by
+bin            |  am
+bis            |  until
+bist           |  art
+da             |  there
+damit          |  with it
+dann           |  then
+
+der            |  the
+den
+des
+dem
+die
+das
+
+daß            |  that
+
+derselbe       |  the same
+derselben
+denselben
+desselben
+demselben
+dieselbe
+dieselben
+dasselbe
+
+dazu           |  to that
+
+dein           |  thy
+deine
+deinem
+deinen
+deiner
+deines
+
+denn           |  because
+
+derer          |  of those
+dessen         |  of him
+
+dich           |  thee
+dir            |  to thee
+du             |  thou
+
+dies           |  this
+diese
+diesem
+diesen
+dieser
+dieses
+
+
+doch           |  (several meanings)
+dort           |  (over) there
+
+
+durch          |  through
+
+ein            |  a
+eine
+einem
+einen
+einer
+eines
+
+einig          |  some
+einige
+einigem
+einigen
+einiger
+einiges
+
+einmal         |  once
+
+er             |  he
+ihn            |  him
+ihm            |  to him
+
+es             |  it
+etwas          |  something
+
+euer           |  your
+eure
+eurem
+euren
+eurer
+eures
+
+für            |  for
+gegen          |  towards
+gewesen        |  p.p. of sein
+hab            |  have
+habe           |  have
+haben          |  have
+hat            |  has
+hatte          |  had
+hatten         |  had
+hier           |  here
+hin            |  there
+hinter         |  behind
+
+ich            |  I
+mich           |  me
+mir            |  to me
+
+
+ihr            |  you, to her
+ihre
+ihrem
+ihren
+ihrer
+ihres
+euch           |  to you
+
+im             |  in + dem
+in             |  in
+indem          |  while
+ins            |  in + das
+ist            |  is
+
+jede           |  each, every
+jedem
+jeden
+jeder
+jedes
+
+jene           |  that
+jenem
+jenen
+jener
+jenes
+
+jetzt          |  now
+kann           |  can
+
+kein           |  no
+keine
+keinem
+keinen
+keiner
+keines
+
+können         |  can
+könnte         |  could
+machen         |  do
+man            |  one
+
+manche         |  some, many a
+manchem
+manchen
+mancher
+manches
+
+mein           |  my
+meine
+meinem
+meinen
+meiner
+meines
+
+mit            |  with
+muss           |  must
+musste         |  had to
+nach           |  to(wards)
+nicht          |  not
+nichts         |  nothing
+noch           |  still, yet
+nun            |  now
+nur            |  only
+ob             |  whether
+oder           |  or
+ohne           |  without
+sehr           |  very
+
+sein           |  his
+seine
+seinem
+seinen
+seiner
+seines
+
+selbst         |  self
+sich           |  herself
+
+sie            |  they, she
+ihnen          |  to them
+
+sind           |  are
+so             |  so
+
+solche         |  such
+solchem
+solchen
+solcher
+solches
+
+soll           |  shall
+sollte         |  should
+sondern        |  but
+sonst          |  else
+über           |  over
+um             |  about, around
+und            |  and
+
+uns            |  us
+unse
+unsem
+unsen
+unser
+unses
+
+unter          |  under
+viel           |  much
+vom            |  von + dem
+von            |  from
+vor            |  before
+während        |  while
+war            |  was
+waren          |  were
+warst          |  wast
+was            |  what
+weg            |  away, off
+weil           |  because
+weiter         |  further
+
+welche         |  which
+welchem
+welchen
+welcher
+welches
+
+wenn           |  when
+werde          |  will
+werden         |  will
+wie            |  how
+wieder         |  again
+will           |  want
+wir            |  we
+wird           |  will
+wirst          |  wilt
+wo             |  where
+wollen         |  want
+wollte         |  wanted
+würde          |  would
+würden         |  would
+zu             |  to
+zum            |  zu + dem
+zur            |  zu + der
+zwar           |  indeed
+zwischen       |  between
+
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_el.txt b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_el.txt
new file mode 100644
index 0000000..232681f
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_el.txt
@@ -0,0 +1,78 @@
+# Lucene Greek Stopwords list
+# Note: by default this file is used after GreekLowerCaseFilter,
+# so when modifying this file use 'σ' instead of 'ς' 
+ο
+η
+το
+οι
+τα
+του
+τησ
+των
+τον
+την
+και 
+κι
+κ
+ειμαι
+εισαι
+ειναι
+ειμαστε
+ειστε
+στο
+στον
+στη
+στην
+μα
+αλλα
+απο
+για
+προσ
+με
+σε
+ωσ
+παρα
+αντι
+κατα
+μετα
+θα
+να
+δε
+δεν
+μη
+μην
+επι
+ενω
+εαν
+αν
+τοτε
+που
+πωσ
+ποιοσ
+ποια
+ποιο
+ποιοι
+ποιεσ
+ποιων
+ποιουσ
+αυτοσ
+αυτη
+αυτο
+αυτοι
+αυτων
+αυτουσ
+αυτεσ
+αυτα
+εκεινοσ
+εκεινη
+εκεινο
+εκεινοι
+εκεινεσ
+εκεινα
+εκεινων
+εκεινουσ
+οπωσ
+ομωσ
+ισωσ
+οσο
+οτι
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_en.txt b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_en.txt
new file mode 100644
index 0000000..2c164c0
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_en.txt
@@ -0,0 +1,54 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# a couple of test stopwords to test that the words are really being
+# configured from this file:
+stopworda
+stopwordb
+
+# Standard English stop words taken from Lucene's StopAnalyzer
+a
+an
+and
+are
+as
+at
+be
+but
+by
+for
+if
+in
+into
+is
+it
+no
+not
+of
+on
+or
+such
+that
+the
+their
+then
+there
+these
+they
+this
+to
+was
+will
+with
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_es.txt b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_es.txt
new file mode 100644
index 0000000..2db1476
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_es.txt
@@ -0,0 +1,354 @@
+ | From svn.tartarus.org/snowball/trunk/website/algorithms/spanish/stop.txt
+ | This file is distributed under the BSD License.
+ | See http://snowball.tartarus.org/license.php
+ | Also see http://www.opensource.org/licenses/bsd-license.html
+ |  - Encoding was converted to UTF-8.
+ |  - This notice was added.
+
+ | A Spanish stop word list. Comments begin with vertical bar. Each stop
+ | word is at the start of a line.
+
+
+ | The following is a ranked list (commonest to rarest) of stopwords
+ | deriving from a large sample of text.
+
+ | Extra words have been added at the end.
+
+de             |  from, of
+la             |  the, her
+que            |  who, that
+el             |  the
+en             |  in
+y              |  and
+a              |  to
+los            |  the, them
+del            |  de + el
+se             |  himself, from him etc
+las            |  the, them
+por            |  for, by, etc
+un             |  a
+para           |  for
+con            |  with
+no             |  no
+una            |  a
+su             |  his, her
+al             |  a + el
+  | es         from SER
+lo             |  him
+como           |  how
+más            |  more
+pero           |  but
+sus            |  su plural
+le             |  to him, her
+ya             |  already
+o              |  or
+  | fue        from SER
+este           |  this
+  | ha         from HABER
+sí             |  himself etc
+porque         |  because
+esta           |  this
+  | son        from SER
+entre          |  between
+  | está     from ESTAR
+cuando         |  when
+muy            |  very
+sin            |  without
+sobre          |  on
+  | ser        from SER
+  | tiene      from TENER
+también        |  also
+me             |  me
+hasta          |  until
+hay            |  there is/are
+donde          |  where
+  | han        from HABER
+quien          |  whom, that
+  | están      from ESTAR
+  | estado     from ESTAR
+desde          |  from
+todo           |  all
+nos            |  us
+durante        |  during
+  | estados    from ESTAR
+todos          |  all
+uno            |  a
+les            |  to them
+ni             |  nor
+contra         |  against
+otros          |  other
+  | fueron     from SER
+ese            |  that
+eso            |  that
+  | había      from HABER
+ante           |  before
+ellos          |  they
+e              |  and (variant of y)
+esto           |  this
+mí             |  me
+antes          |  before
+algunos        |  some
+qué            |  what?
+unos           |  a
+yo             |  I
+otro           |  other
+otras          |  other
+otra           |  other
+él             |  he
+tanto          |  so much, many
+esa            |  that
+estos          |  these
+mucho          |  much, many
+quienes        |  who
+nada           |  nothing
+muchos         |  many
+cual           |  who
+  | sea        from SER
+poco           |  few
+ella           |  she
+estar          |  to be
+  | haber      from HABER
+estas          |  these
+  | estaba     from ESTAR
+  | estamos    from ESTAR
+algunas        |  some
+algo           |  something
+nosotros       |  we
+
+      | other forms
+
+mi             |  me
+mis            |  mi plural
+tú             |  thou
+te             |  thee
+ti             |  thee
+tu             |  thy
+tus            |  tu plural
+ellas          |  they
+nosotras       |  we
+vosotros       |  you
+vosotras       |  you
+os             |  you
+mío            |  mine
+mía            |
+míos           |
+mías           |
+tuyo           |  thine
+tuya           |
+tuyos          |
+tuyas          |
+suyo           |  his, hers, theirs
+suya           |
+suyos          |
+suyas          |
+nuestro        |  ours
+nuestra        |
+nuestros       |
+nuestras       |
+vuestro        |  yours
+vuestra        |
+vuestros       |
+vuestras       |
+esos           |  those
+esas           |  those
+
+               | forms of estar, to be (not including the infinitive):
+estoy
+estás
+está
+estamos
+estáis
+están
+esté
+estés
+estemos
+estéis
+estén
+estaré
+estarás
+estará
+estaremos
+estaréis
+estarán
+estaría
+estarías
+estaríamos
+estaríais
+estarían
+estaba
+estabas
+estábamos
+estabais
+estaban
+estuve
+estuviste
+estuvo
+estuvimos
+estuvisteis
+estuvieron
+estuviera
+estuvieras
+estuviéramos
+estuvierais
+estuvieran
+estuviese
+estuvieses
+estuviésemos
+estuvieseis
+estuviesen
+estando
+estado
+estada
+estados
+estadas
+estad
+
+               | forms of haber, to have (not including the infinitive):
+he
+has
+ha
+hemos
+habéis
+han
+haya
+hayas
+hayamos
+hayáis
+hayan
+habré
+habrás
+habrá
+habremos
+habréis
+habrán
+habría
+habrías
+habríamos
+habríais
+habrían
+había
+habías
+habíamos
+habíais
+habían
+hube
+hubiste
+hubo
+hubimos
+hubisteis
+hubieron
+hubiera
+hubieras
+hubiéramos
+hubierais
+hubieran
+hubiese
+hubieses
+hubiésemos
+hubieseis
+hubiesen
+habiendo
+habido
+habida
+habidos
+habidas
+
+               | forms of ser, to be (not including the infinitive):
+soy
+eres
+es
+somos
+sois
+son
+sea
+seas
+seamos
+seáis
+sean
+seré
+serás
+será
+seremos
+seréis
+serán
+sería
+serías
+seríamos
+seríais
+serían
+era
+eras
+éramos
+erais
+eran
+fui
+fuiste
+fue
+fuimos
+fuisteis
+fueron
+fuera
+fueras
+fuéramos
+fuerais
+fueran
+fuese
+fueses
+fuésemos
+fueseis
+fuesen
+siendo
+sido
+  |  sed also means 'thirst'
+
+               | forms of tener, to have (not including the infinitive):
+tengo
+tienes
+tiene
+tenemos
+tenéis
+tienen
+tenga
+tengas
+tengamos
+tengáis
+tengan
+tendré
+tendrás
+tendrá
+tendremos
+tendréis
+tendrán
+tendría
+tendrías
+tendríamos
+tendríais
+tendrían
+tenía
+tenías
+teníamos
+teníais
+tenían
+tuve
+tuviste
+tuvo
+tuvimos
+tuvisteis
+tuvieron
+tuviera
+tuvieras
+tuviéramos
+tuvierais
+tuvieran
+tuviese
+tuvieses
+tuviésemos
+tuvieseis
+tuviesen
+teniendo
+tenido
+tenida
+tenidos
+tenidas
+tened
+
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_eu.txt b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_eu.txt
new file mode 100644
index 0000000..25f1db9
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_eu.txt
@@ -0,0 +1,99 @@
+# example set of Basque stopwords
+al
+anitz
+arabera
+asko
+baina
+bat
+batean
+batek
+bati
+batzuei
+batzuek
+batzuetan
+batzuk
+bera
+beraiek
+berau
+berauek
+bere
+berori
+beroriek
+beste
+bezala
+da
+dago
+dira
+ditu
+du
+dute
+edo
+egin
+ere
+eta
+eurak
+ez
+gainera
+gu
+gutxi
+guzti
+haiei
+haiek
+haietan
+hainbeste
+hala
+han
+handik
+hango
+hara
+hari
+hark
+hartan
+hau
+hauei
+hauek
+hauetan
+hemen
+hemendik
+hemengo
+hi
+hona
+honek
+honela
+honetan
+honi
+hor
+hori
+horiei
+horiek
+horietan
+horko
+horra
+horrek
+horrela
+horretan
+horri
+hortik
+hura
+izan
+ni
+noiz
+nola
+non
+nondik
+nongo
+nor
+nora
+ze
+zein
+zen
+zenbait
+zenbat
+zer
+zergatik
+ziren
+zituen
+zu
+zuek
+zuen
+zuten
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_fa.txt b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_fa.txt
new file mode 100644
index 0000000..723641c
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_fa.txt
@@ -0,0 +1,313 @@
+# This file was created by Jacques Savoy and is distributed under the BSD license.
+# See http://members.unine.ch/jacques.savoy/clef/index.html.
+# Also see http://www.opensource.org/licenses/bsd-license.html
+# Note: by default this file is used after normalization, so when adding entries
+# to this file, use the arabic 'ي' instead of 'ی'
+انان
+نداشته
+سراسر
+خياه
+ايشان
+وي
+تاكنون
+بيشتري
+دوم
+پس
+ناشي
+وگو
+يا
+داشتند
+سپس
+هنگام
+هرگز
+پنج
+نشان
+امسال
+ديگر
+گروهي
+شدند
+چطور
+ده
+و
+دو
+نخستين
+ولي
+چرا
+چه
+وسط
+ه
+كدام
+قابل
+يك
+رفت
+هفت
+همچنين
+در
+هزار
+بله
+بلي
+شايد
+اما
+شناسي
+گرفته
+دهد
+داشته
+دانست
+داشتن
+خواهيم
+ميليارد
+وقتيكه
+امد
+خواهد
+جز
+اورده
+شده
+بلكه
+خدمات
+شدن
+برخي
+نبود
+بسياري
+جلوگيري
+حق
+كردند
+نوعي
+بعري
+نكرده
+نظير
+نبايد
+بوده
+بودن
+داد
+اورد
+هست
+جايي
+شود
+دنبال
+داده
+بايد
+سابق
+هيچ
+همان
+انجا
+كمتر
+كجاست
+گردد
+كسي
+تر
+مردم
+تان
+دادن
+بودند
+سري
+جدا
+ندارند
+مگر
+يكديگر
+دارد
+دهند
+بنابراين
+هنگامي
+سمت
+جا
+انچه
+خود
+دادند
+زياد
+دارند
+اثر
+بدون
+بهترين
+بيشتر
+البته
+به
+براساس
+بيرون
+كرد
+بعضي
+گرفت
+توي
+اي
+ميليون
+او
+جريان
+تول
+بر
+مانند
+برابر
+باشيم
+مدتي
+گويند
+اكنون
+تا
+تنها
+جديد
+چند
+بي
+نشده
+كردن
+كردم
+گويد
+كرده
+كنيم
+نمي
+نزد
+روي
+قصد
+فقط
+بالاي
+ديگران
+اين
+ديروز
+توسط
+سوم
+ايم
+دانند
+سوي
+استفاده
+شما
+كنار
+داريم
+ساخته
+طور
+امده
+رفته
+نخست
+بيست
+نزديك
+طي
+كنيد
+از
+انها
+تمامي
+داشت
+يكي
+طريق
+اش
+چيست
+روب
+نمايد
+گفت
+چندين
+چيزي
+تواند
+ام
+ايا
+با
+ان
+ايد
+ترين
+اينكه
+ديگري
+راه
+هايي
+بروز
+همچنان
+پاعين
+كس
+حدود
+مختلف
+مقابل
+چيز
+گيرد
+ندارد
+ضد
+همچون
+سازي
+شان
+مورد
+باره
+مرسي
+خويش
+برخوردار
+چون
+خارج
+شش
+هنوز
+تحت
+ضمن
+هستيم
+گفته
+فكر
+بسيار
+پيش
+براي
+روزهاي
+انكه
+نخواهد
+بالا
+كل
+وقتي
+كي
+چنين
+كه
+گيري
+نيست
+است
+كجا
+كند
+نيز
+يابد
+بندي
+حتي
+توانند
+عقب
+خواست
+كنند
+بين
+تمام
+همه
+ما
+باشند
+مثل
+شد
+اري
+باشد
+اره
+طبق
+بعد
+اگر
+صورت
+غير
+جاي
+بيش
+ريزي
+اند
+زيرا
+چگونه
+بار
+لطفا
+مي
+درباره
+من
+ديده
+همين
+گذاري
+برداري
+علت
+گذاشته
+هم
+فوق
+نه
+ها
+شوند
+اباد
+همواره
+هر
+اول
+خواهند
+چهار
+نام
+امروز
+مان
+هاي
+قبل
+كنم
+سعي
+تازه
+را
+هستند
+زير
+جلوي
+عنوان
+بود
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_fi.txt b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_fi.txt
new file mode 100644
index 0000000..addad79
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_fi.txt
@@ -0,0 +1,95 @@
+ | From svn.tartarus.org/snowball/trunk/website/algorithms/finnish/stop.txt
+ | This file is distributed under the BSD License.
+ | See http://snowball.tartarus.org/license.php
+ | Also see http://www.opensource.org/licenses/bsd-license.html
+ |  - Encoding was converted to UTF-8.
+ |  - This notice was added.
+ 
+| forms of BE
+
+olla
+olen
+olet
+on
+olemme
+olette
+ovat
+ole        | negative form
+
+oli
+olisi
+olisit
+olisin
+olisimme
+olisitte
+olisivat
+olit
+olin
+olimme
+olitte
+olivat
+ollut
+olleet
+
+en         | negation
+et
+ei
+emme
+ette
+eivät
+
+|Nom   Gen    Acc    Part   Iness   Elat    Illat  Adess   Ablat   Allat   Ess    Trans
+minä   minun  minut  minua  minussa minusta minuun minulla minulta minulle               | I
+sinä   sinun  sinut  sinua  sinussa sinusta sinuun sinulla sinulta sinulle               | you
+hän    hänen  hänet  häntä  hänessä hänestä häneen hänellä häneltä hänelle               | he she
+me     meidän meidät meitä  meissä  meistä  meihin meillä  meiltä  meille                | we
+te     teidän teidät teitä  teissä  teistä  teihin teillä  teiltä  teille                | you
+he     heidän heidät heitä  heissä  heistä  heihin heillä  heiltä  heille                | they
+
+tämä   tämän         tätä   tässä   tästä   tähän  tallä   tältä   tälle   tänä   täksi  | this
+tuo    tuon          tuotä  tuossa  tuosta  tuohon tuolla  tuolta  tuolle  tuona  tuoksi | that
+se     sen           sitä   siinä   siitä   siihen sillä   siltä   sille   sinä   siksi  | it
+nämä   näiden        näitä  näissä  näistä  näihin näillä  näiltä  näille  näinä  näiksi | these
+nuo    noiden        noita  noissa  noista  noihin noilla  noilta  noille  noina  noiksi | those
+ne     niiden        niitä  niissä  niistä  niihin niillä  niiltä  niille  niinä  niiksi | they
+
+kuka   kenen kenet   ketä   kenessä kenestä keneen kenellä keneltä kenelle kenenä keneksi| who
+ketkä  keiden ketkä  keitä  keissä  keistä  keihin keillä  keiltä  keille  keinä  keiksi | (pl)
+mikä   minkä minkä   mitä   missä   mistä   mihin  millä   miltä   mille   minä   miksi  | which what
+mitkä                                                                                    | (pl)
+
+joka   jonka         jota   jossa   josta   johon  jolla   jolta   jolle   jona   joksi  | who which
+jotka  joiden        joita  joissa  joista  joihin joilla  joilta  joille  joina  joiksi | (pl)
+
+| conjunctions
+
+että   | that
+ja     | and
+jos    | if
+koska  | because
+kuin   | than
+mutta  | but
+niin   | so
+sekä   | and
+sillä  | for
+tai    | or
+vaan   | but
+vai    | or
+vaikka | although
+
+
+| prepositions
+
+kanssa  | with
+mukaan  | according to
+noin    | about
+poikki  | across
+yli     | over, across
+
+| other
+
+kun    | when
+niin   | so
+nyt    | now
+itse   | self
+
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_fr.txt b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_fr.txt
new file mode 100644
index 0000000..c00837e
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_fr.txt
@@ -0,0 +1,183 @@
+ | From svn.tartarus.org/snowball/trunk/website/algorithms/french/stop.txt
+ | This file is distributed under the BSD License.
+ | See http://snowball.tartarus.org/license.php
+ | Also see http://www.opensource.org/licenses/bsd-license.html
+ |  - Encoding was converted to UTF-8.
+ |  - This notice was added.
+
+ | A French stop word list. Comments begin with vertical bar. Each stop
+ | word is at the start of a line.
+
+au             |  a + le
+aux            |  a + les
+avec           |  with
+ce             |  this
+ces            |  these
+dans           |  in
+de             |  of
+des            |  de + les
+du             |  de + le
+elle           |  she
+en             |  `of them' etc
+et             |  and
+eux            |  them
+il             |  he
+je             |  I
+la             |  the
+le             |  the
+leur           |  their
+lui            |  him
+ma             |  my (fem)
+mais           |  but
+me             |  me
+même           |  same; as in moi-même (myself) etc
+mes            |  my (pl)
+moi            |  me
+mon            |  my (masc)
+ne             |  not
+nos            |  our (pl)
+notre          |  our
+nous           |  we
+on             |  one
+ou             |  or
+par            |  by
+pas            |  not
+pour           |  for
+qu             |  que before vowel
+que            |  that
+qui            |  who
+sa             |  his, her (fem)
+se             |  oneself
+ses            |  his (pl)
+son            |  his, her (masc)
+sur            |  on
+ta             |  thy (fem)
+te             |  thee
+tes            |  thy (pl)
+toi            |  thee
+ton            |  thy (masc)
+tu             |  thou
+un             |  a
+une            |  a
+vos            |  your (pl)
+votre          |  your
+vous           |  you
+
+               |  single letter forms
+
+c              |  c'
+d              |  d'
+j              |  j'
+l              |  l'
+à              |  to, at
+m              |  m'
+n              |  n'
+s              |  s'
+t              |  t'
+y              |  there
+
+               | forms of être (not including the infinitive):
+été
+étée
+étées
+étés
+étant
+suis
+es
+est
+sommes
+êtes
+sont
+serai
+seras
+sera
+serons
+serez
+seront
+serais
+serait
+serions
+seriez
+seraient
+étais
+était
+étions
+étiez
+étaient
+fus
+fut
+fûmes
+fûtes
+furent
+sois
+soit
+soyons
+soyez
+soient
+fusse
+fusses
+fût
+fussions
+fussiez
+fussent
+
+               | forms of avoir (not including the infinitive):
+ayant
+eu
+eue
+eues
+eus
+ai
+as
+avons
+avez
+ont
+aurai
+auras
+aura
+aurons
+aurez
+auront
+aurais
+aurait
+aurions
+auriez
+auraient
+avais
+avait
+avions
+aviez
+avaient
+eut
+eûmes
+eûtes
+eurent
+aie
+aies
+ait
+ayons
+ayez
+aient
+eusse
+eusses
+eût
+eussions
+eussiez
+eussent
+
+               | Later additions (from Jean-Christophe Deschamps)
+ceci           |  this
+celà           |  that
+cet            |  this
+cette          |  this
+ici            |  here
+ils            |  they
+les            |  the (pl)
+leurs          |  their (pl)
+quel           |  which
+quels          |  which
+quelle         |  which
+quelles        |  which
+sans           |  without
+soi            |  oneself
+
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_ga.txt b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_ga.txt
new file mode 100644
index 0000000..9ff88d7
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_ga.txt
@@ -0,0 +1,110 @@
+
+a
+ach
+ag
+agus
+an
+aon
+ar
+arna
+as
+b'
+ba
+beirt
+bhúr
+caoga
+ceathair
+ceathrar
+chomh
+chtó
+chuig
+chun
+cois
+céad
+cúig
+cúigear
+d'
+daichead
+dar
+de
+deich
+deichniúr
+den
+dhá
+do
+don
+dtí
+dá
+dár
+dó
+faoi
+faoin
+faoina
+faoinár
+fara
+fiche
+gach
+gan
+go
+gur
+haon
+hocht
+i
+iad
+idir
+in
+ina
+ins
+inár
+is
+le
+leis
+lena
+lenár
+m'
+mar
+mo
+mé
+na
+nach
+naoi
+naonúr
+ná
+ní
+níor
+nó
+nócha
+ocht
+ochtar
+os
+roimh
+sa
+seacht
+seachtar
+seachtó
+seasca
+seisear
+siad
+sibh
+sinn
+sna
+sé
+sí
+tar
+thar
+thú
+triúr
+trí
+trína
+trínár
+tríocha
+tú
+um
+ár
+é
+éis
+í
+ó
+ón
+óna
+ónár
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_gl.txt b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_gl.txt
new file mode 100644
index 0000000..d8760b1
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_gl.txt
@@ -0,0 +1,161 @@
+# Galician stopwords
+a
+aínda
+alí
+aquel
+aquela
+aquelas
+aqueles
+aquilo
+aquí
+ao
+aos
+as
+así
+á
+ben
+cando
+che
+co
+coa
+comigo
+con
+connosco
+contigo
+convosco
+coas
+cos
+cun
+cuns
+cunha
+cunhas
+da
+dalgunha
+dalgunhas
+dalgún
+dalgúns
+das
+de
+del
+dela
+delas
+deles
+desde
+deste
+do
+dos
+dun
+duns
+dunha
+dunhas
+e
+el
+ela
+elas
+eles
+en
+era
+eran
+esa
+esas
+ese
+eses
+esta
+estar
+estaba
+está
+están
+este
+estes
+estiven
+estou
+eu
+é
+facer
+foi
+foron
+fun
+había
+hai
+iso
+isto
+la
+las
+lle
+lles
+lo
+los
+mais
+me
+meu
+meus
+min
+miña
+miñas
+moi
+na
+nas
+neste
+nin
+no
+non
+nos
+nosa
+nosas
+noso
+nosos
+nós
+nun
+nunha
+nuns
+nunhas
+o
+os
+ou
+ó
+ós
+para
+pero
+pode
+pois
+pola
+polas
+polo
+polos
+por
+que
+se
+senón
+ser
+seu
+seus
+sexa
+sido
+sobre
+súa
+súas
+tamén
+tan
+te
+ten
+teñen
+teño
+ter
+teu
+teus
+ti
+tido
+tiña
+tiven
+túa
+túas
+un
+unha
+unhas
+uns
+vos
+vosa
+vosas
+voso
+vosos
+vós
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_hi.txt b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_hi.txt
new file mode 100644
index 0000000..86286bb
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_hi.txt
@@ -0,0 +1,235 @@
+# Also see http://www.opensource.org/licenses/bsd-license.html
+# See http://members.unine.ch/jacques.savoy/clef/index.html.
+# This file was created by Jacques Savoy and is distributed under the BSD license.
+# Note: by default this file also contains forms normalized by HindiNormalizer 
+# for spelling variation (see section below), such that it can be used whether or 
+# not you enable that feature. When adding additional entries to this list,
+# please add the normalized form as well. 
+अंदर
+अत
+अपना
+अपनी
+अपने
+अभी
+आदि
+आप
+इत्यादि
+इन 
+इनका
+इन्हीं
+इन्हें
+इन्हों
+इस
+इसका
+इसकी
+इसके
+इसमें
+इसी
+इसे
+उन
+उनका
+उनकी
+उनके
+उनको
+उन्हीं
+उन्हें
+उन्हों
+उस
+उसके
+उसी
+उसे
+एक
+एवं
+एस
+ऐसे
+और
+कई
+कर
+करता
+करते
+करना
+करने
+करें
+कहते
+कहा
+का
+काफ़ी
+कि
+कितना
+किन्हें
+किन्हों
+किया
+किर
+किस
+किसी
+किसे
+की
+कुछ
+कुल
+के
+को
+कोई
+कौन
+कौनसा
+गया
+घर
+जब
+जहाँ
+जा
+जितना
+जिन
+जिन्हें
+जिन्हों
+जिस
+जिसे
+जीधर
+जैसा
+जैसे
+जो
+तक
+तब
+तरह
+तिन
+तिन्हें
+तिन्हों
+तिस
+तिसे
+तो
+था
+थी
+थे
+दबारा
+दिया
+दुसरा
+दूसरे
+दो
+द्वारा
+न
+नहीं
+ना
+निहायत
+नीचे
+ने
+पर
+पर  
+पहले
+पूरा
+पे
+फिर
+बनी
+बही
+बहुत
+बाद
+बाला
+बिलकुल
+भी
+भीतर
+मगर
+मानो
+मे
+में
+यदि
+यह
+यहाँ
+यही
+या
+यिह 
+ये
+रखें
+रहा
+रहे
+ऱ्वासा
+लिए
+लिये
+लेकिन
+व
+वर्ग
+वह
+वह 
+वहाँ
+वहीं
+वाले
+वुह 
+वे
+वग़ैरह
+संग
+सकता
+सकते
+सबसे
+सभी
+साथ
+साबुत
+साभ
+सारा
+से
+सो
+ही
+हुआ
+हुई
+हुए
+है
+हैं
+हो
+होता
+होती
+होते
+होना
+होने
+# additional normalized forms of the above
+अपनि
+जेसे
+होति
+सभि
+तिंहों
+इंहों
+दवारा
+इसि
+किंहें
+थि
+उंहों
+ओर
+जिंहें
+वहिं
+अभि
+बनि
+हि
+उंहिं
+उंहें
+हें
+वगेरह
+एसे
+रवासा
+कोन
+निचे
+काफि
+उसि
+पुरा
+भितर
+हे
+बहि
+वहां
+कोइ
+यहां
+जिंहों
+तिंहें
+किसि
+कइ
+यहि
+इंहिं
+जिधर
+इंहें
+अदि
+इतयादि
+हुइ
+कोनसा
+इसकि
+दुसरे
+जहां
+अप
+किंहों
+उनकि
+भि
+वरग
+हुअ
+जेसा
+नहिं
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_hu.txt b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_hu.txt
new file mode 100644
index 0000000..1a96f1d
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_hu.txt
@@ -0,0 +1,209 @@
+ | From svn.tartarus.org/snowball/trunk/website/algorithms/hungarian/stop.txt
+ | This file is distributed under the BSD License.
+ | See http://snowball.tartarus.org/license.php
+ | Also see http://www.opensource.org/licenses/bsd-license.html
+ |  - Encoding was converted to UTF-8.
+ |  - This notice was added.
+ 
+| Hungarian stop word list
+| prepared by Anna Tordai
+
+a
+ahogy
+ahol
+aki
+akik
+akkor
+alatt
+által
+általában
+amely
+amelyek
+amelyekben
+amelyeket
+amelyet
+amelynek
+ami
+amit
+amolyan
+amíg
+amikor
+át
+abban
+ahhoz
+annak
+arra
+arról
+az
+azok
+azon
+azt
+azzal
+azért
+aztán
+azután
+azonban
+bár
+be
+belül
+benne
+cikk
+cikkek
+cikkeket
+csak
+de
+e
+eddig
+egész
+egy
+egyes
+egyetlen
+egyéb
+egyik
+egyre
+ekkor
+el
+elég
+ellen
+elő
+először
+előtt
+első
+én
+éppen
+ebben
+ehhez
+emilyen
+ennek
+erre
+ez
+ezt
+ezek
+ezen
+ezzel
+ezért
+és
+fel
+felé
+hanem
+hiszen
+hogy
+hogyan
+igen
+így
+illetve
+ill.
+ill
+ilyen
+ilyenkor
+ison
+ismét
+itt
+jó
+jól
+jobban
+kell
+kellett
+keresztül
+keressünk
+ki
+kívül
+között
+közül
+legalább
+lehet
+lehetett
+legyen
+lenne
+lenni
+lesz
+lett
+maga
+magát
+majd
+majd
+már
+más
+másik
+meg
+még
+mellett
+mert
+mely
+melyek
+mi
+mit
+míg
+miért
+milyen
+mikor
+minden
+mindent
+mindenki
+mindig
+mint
+mintha
+mivel
+most
+nagy
+nagyobb
+nagyon
+ne
+néha
+nekem
+neki
+nem
+néhány
+nélkül
+nincs
+olyan
+ott
+össze
+ő
+ők
+őket
+pedig
+persze
+rá
+s
+saját
+sem
+semmi
+sok
+sokat
+sokkal
+számára
+szemben
+szerint
+szinte
+talán
+tehát
+teljes
+tovább
+továbbá
+több
+úgy
+ugyanis
+új
+újabb
+újra
+után
+utána
+utolsó
+vagy
+vagyis
+valaki
+valami
+valamint
+való
+vagyok
+van
+vannak
+volt
+voltam
+voltak
+voltunk
+vissza
+vele
+viszont
+volna
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_hy.txt b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_hy.txt
new file mode 100644
index 0000000..60c1c50
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_hy.txt
@@ -0,0 +1,46 @@
+# example set of Armenian stopwords.
+այդ
+այլ
+այն
+այս
+դու
+դուք
+եմ
+են
+ենք
+ես
+եք
+է
+էի
+էին
+էինք
+էիր
+էիք
+էր
+ըստ
+թ
+ի
+ին
+իսկ
+իր
+կամ
+համար
+հետ
+հետո
+մենք
+մեջ
+մի
+ն
+նա
+նաև
+նրա
+նրանք
+որ
+որը
+որոնք
+որպես
+ու
+ում
+պիտի
+վրա
+և
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_id.txt b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_id.txt
new file mode 100644
index 0000000..4617f83
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_id.txt
@@ -0,0 +1,359 @@
+# from appendix D of: A Study of Stemming Effects on Information
+# Retrieval in Bahasa Indonesia
+ada
+adanya
+adalah
+adapun
+agak
+agaknya
+agar
+akan
+akankah
+akhirnya
+aku
+akulah
+amat
+amatlah
+anda
+andalah
+antar
+diantaranya
+antara
+antaranya
+diantara
+apa
+apaan
+mengapa
+apabila
+apakah
+apalagi
+apatah
+atau
+ataukah
+ataupun
+bagai
+bagaikan
+sebagai
+sebagainya
+bagaimana
+bagaimanapun
+sebagaimana
+bagaimanakah
+bagi
+bahkan
+bahwa
+bahwasanya
+sebaliknya
+banyak
+sebanyak
+beberapa
+seberapa
+begini
+beginian
+beginikah
+beginilah
+sebegini
+begitu
+begitukah
+begitulah
+begitupun
+sebegitu
+belum
+belumlah
+sebelum
+sebelumnya
+sebenarnya
+berapa
+berapakah
+berapalah
+berapapun
+betulkah
+sebetulnya
+biasa
+biasanya
+bila
+bilakah
+bisa
+bisakah
+sebisanya
+boleh
+bolehkah
+bolehlah
+buat
+bukan
+bukankah
+bukanlah
+bukannya
+cuma
+percuma
+dahulu
+dalam
+dan
+dapat
+dari
+daripada
+dekat
+demi
+demikian
+demikianlah
+sedemikian
+dengan
+depan
+di
+dia
+dialah
+dini
+diri
+dirinya
+terdiri
+dong
+dulu
+enggak
+enggaknya
+entah
+entahlah
+terhadap
+terhadapnya
+hal
+hampir
+hanya
+hanyalah
+harus
+haruslah
+harusnya
+seharusnya
+hendak
+hendaklah
+hendaknya
+hingga
+sehingga
+ia
+ialah
+ibarat
+ingin
+inginkah
+inginkan
+ini
+inikah
+inilah
+itu
+itukah
+itulah
+jangan
+jangankan
+janganlah
+jika
+jikalau
+juga
+justru
+kala
+kalau
+kalaulah
+kalaupun
+kalian
+kami
+kamilah
+kamu
+kamulah
+kan
+kapan
+kapankah
+kapanpun
+dikarenakan
+karena
+karenanya
+ke
+kecil
+kemudian
+kenapa
+kepada
+kepadanya
+ketika
+seketika
+khususnya
+kini
+kinilah
+kiranya
+sekiranya
+kita
+kitalah
+kok
+lagi
+lagian
+selagi
+lah
+lain
+lainnya
+melainkan
+selaku
+lalu
+melalui
+terlalu
+lama
+lamanya
+selama
+selama
+selamanya
+lebih
+terlebih
+bermacam
+macam
+semacam
+maka
+makanya
+makin
+malah
+malahan
+mampu
+mampukah
+mana
+manakala
+manalagi
+masih
+masihkah
+semasih
+masing
+mau
+maupun
+semaunya
+memang
+mereka
+merekalah
+meski
+meskipun
+semula
+mungkin
+mungkinkah
+nah
+namun
+nanti
+nantinya
+nyaris
+oleh
+olehnya
+seorang
+seseorang
+pada
+padanya
+padahal
+paling
+sepanjang
+pantas
+sepantasnya
+sepantasnyalah
+para
+pasti
+pastilah
+per
+pernah
+pula
+pun
+merupakan
+rupanya
+serupa
+saat
+saatnya
+sesaat
+saja
+sajalah
+saling
+bersama
+sama
+sesama
+sambil
+sampai
+sana
+sangat
+sangatlah
+saya
+sayalah
+se
+sebab
+sebabnya
+sebuah
+tersebut
+tersebutlah
+sedang
+sedangkan
+sedikit
+sedikitnya
+segala
+segalanya
+segera
+sesegera
+sejak
+sejenak
+sekali
+sekalian
+sekalipun
+sesekali
+sekaligus
+sekarang
+sekarang
+sekitar
+sekitarnya
+sela
+selain
+selalu
+seluruh
+seluruhnya
+semakin
+sementara
+sempat
+semua
+semuanya
+sendiri
+sendirinya
+seolah
+seperti
+sepertinya
+sering
+seringnya
+serta
+siapa
+siapakah
+siapapun
+disini
+disinilah
+sini
+sinilah
+sesuatu
+sesuatunya
+suatu
+sesudah
+sesudahnya
+sudah
+sudahkah
+sudahlah
+supaya
+tadi
+tadinya
+tak
+tanpa
+setelah
+telah
+tentang
+tentu
+tentulah
+tentunya
+tertentu
+seterusnya
+tapi
+tetapi
+setiap
+tiap
+setidaknya
+tidak
+tidakkah
+tidaklah
+toh
+waduh
+wah
+wahai
+sewaktu
+walau
+walaupun
+wong
+yaitu
+yakni
+yang
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_it.txt b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_it.txt
new file mode 100644
index 0000000..4cb5b08
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_it.txt
@@ -0,0 +1,301 @@
+ | From svn.tartarus.org/snowball/trunk/website/algorithms/italian/stop.txt
+ | This file is distributed under the BSD License.
+ | See http://snowball.tartarus.org/license.php
+ | Also see http://www.opensource.org/licenses/bsd-license.html
+ |  - Encoding was converted to UTF-8.
+ |  - This notice was added.
+
+ | An Italian stop word list. Comments begin with vertical bar. Each stop
+ | word is at the start of a line.
+
+ad             |  a (to) before vowel
+al             |  a + il
+allo           |  a + lo
+ai             |  a + i
+agli           |  a + gli
+all            |  a + l'
+agl            |  a + gl'
+alla           |  a + la
+alle           |  a + le
+con            |  with
+col            |  con + il
+coi            |  con + i (forms collo, cogli etc are now very rare)
+da             |  from
+dal            |  da + il
+dallo          |  da + lo
+dai            |  da + i
+dagli          |  da + gli
+dall           |  da + l'
+dagl           |  da + gl'
+dalla          |  da + la
+dalle          |  da + le
+di             |  of
+del            |  di + il
+dello          |  di + lo
+dei            |  di + i
+degli          |  di + gli
+dell           |  di + l'
+degl           |  di + gl'
+della          |  di + la
+delle          |  di + le
+in             |  in
+nel            |  in + il
+nello          |  in + lo
+nei            |  in + i
+negli          |  in + gli
+nell           |  in + l'
+negl           |  in + gl'
+nella          |  in + la
+nelle          |  in + le
+su             |  on
+sul            |  su + il
+sullo          |  su + lo
+sui            |  su + i
+sugli          |  su + gli
+sull           |  su + l'
+sugl           |  su + gl'
+sulla          |  su + la
+sulle          |  su + le
+per            |  through, by
+tra            |  among
+contro         |  against
+io             |  I
+tu             |  thou
+lui            |  he
+lei            |  she
+noi            |  we
+voi            |  you
+loro           |  they
+mio            |  my
+mia            |
+miei           |
+mie            |
+tuo            |
+tua            |
+tuoi           |  thy
+tue            |
+suo            |
+sua            |
+suoi           |  his, her
+sue            |
+nostro         |  our
+nostra         |
+nostri         |
+nostre         |
+vostro         |  your
+vostra         |
+vostri         |
+vostre         |
+mi             |  me
+ti             |  thee
+ci             |  us, there
+vi             |  you, there
+lo             |  him, the
+la             |  her, the
+li             |  them
+le             |  them, the
+gli            |  to him, the
+ne             |  from there etc
+il             |  the
+un             |  a
+uno            |  a
+una            |  a
+ma             |  but
+ed             |  and
+se             |  if
+perché         |  why, because
+anche          |  also
+come           |  how
+dov            |  where (as dov')
+dove           |  where
+che            |  who, that
+chi            |  who
+cui            |  whom
+non            |  not
+più            |  more
+quale          |  who, that
+quanto         |  how much
+quanti         |
+quanta         |
+quante         |
+quello         |  that
+quelli         |
+quella         |
+quelle         |
+questo         |  this
+questi         |
+questa         |
+queste         |
+si             |  yes
+tutto          |  all
+tutti          |  all
+
+               |  single letter forms:
+
+a              |  at
+c              |  as c' for ce or ci
+e              |  and
+i              |  the
+l              |  as l'
+o              |  or
+
+               | forms of avere, to have (not including the infinitive):
+
+ho
+hai
+ha
+abbiamo
+avete
+hanno
+abbia
+abbiate
+abbiano
+avrò
+avrai
+avrà
+avremo
+avrete
+avranno
+avrei
+avresti
+avrebbe
+avremmo
+avreste
+avrebbero
+avevo
+avevi
+aveva
+avevamo
+avevate
+avevano
+ebbi
+avesti
+ebbe
+avemmo
+aveste
+ebbero
+avessi
+avesse
+avessimo
+avessero
+avendo
+avuto
+avuta
+avuti
+avute
+
+               | forms of essere, to be (not including the infinitive):
+sono
+sei
+è
+siamo
+siete
+sia
+siate
+siano
+sarò
+sarai
+sarà
+saremo
+sarete
+saranno
+sarei
+saresti
+sarebbe
+saremmo
+sareste
+sarebbero
+ero
+eri
+era
+eravamo
+eravate
+erano
+fui
+fosti
+fu
+fummo
+foste
+furono
+fossi
+fosse
+fossimo
+fossero
+essendo
+
+               | forms of fare, to do (not including the infinitive, fa, fat-):
+faccio
+fai
+facciamo
+fanno
+faccia
+facciate
+facciano
+farò
+farai
+farà
+faremo
+farete
+faranno
+farei
+faresti
+farebbe
+faremmo
+fareste
+farebbero
+facevo
+facevi
+faceva
+facevamo
+facevate
+facevano
+feci
+facesti
+fece
+facemmo
+faceste
+fecero
+facessi
+facesse
+facessimo
+facessero
+facendo
+
+               | forms of stare, to be (not including the infinitive):
+sto
+stai
+sta
+stiamo
+stanno
+stia
+stiate
+stiano
+starò
+starai
+starà
+staremo
+starete
+staranno
+starei
+staresti
+starebbe
+staremmo
+stareste
+starebbero
+stavo
+stavi
+stava
+stavamo
+stavate
+stavano
+stetti
+stesti
+stette
+stemmo
+steste
+stettero
+stessi
+stesse
+stessimo
+stessero
+stando
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_ja.txt b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_ja.txt
new file mode 100644
index 0000000..d4321be
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_ja.txt
@@ -0,0 +1,127 @@
+#
+# This file defines a stopword set for Japanese.
+#
+# This set is made up of hand-picked frequent terms from segmented Japanese Wikipedia.
+# Punctuation characters and frequent kanji have mostly been left out.  See LUCENE-3745
+# for frequency lists, etc. that can be useful for making your own set (if desired)
+#
+# Note that there is an overlap between these stopwords and the terms stopped when used
+# in combination with the JapanesePartOfSpeechStopFilter.  When editing this file, note
+# that comments are not allowed on the same line as stopwords.
+#
+# Also note that stopping is done in a case-insensitive manner.  Change your StopFilter
+# configuration if you need case-sensitive stopping.  Lastly, note that stopping is done
+# using the same character width as the entries in this file.  Since this StopFilter is
+# normally done after a CJKWidthFilter in your chain, you would usually want your romaji
+# entries to be in half-width and your kana entries to be in full-width.
+#
+の
+に
+は
+を
+た
+が
+で
+て
+と
+し
+れ
+さ
+ある
+いる
+も
+する
+から
+な
+こと
+として
+い
+や
+れる
+など
+なっ
+ない
+この
+ため
+その
+あっ
+よう
+また
+もの
+という
+あり
+まで
+られ
+なる
+へ
+か
+だ
+これ
+によって
+により
+おり
+より
+による
+ず
+なり
+られる
+において
+ば
+なかっ
+なく
+しかし
+について
+せ
+だっ
+その後
+できる
+それ
+う
+ので
+なお
+のみ
+でき
+き
+つ
+における
+および
+いう
+さらに
+でも
+ら
+たり
+その他
+に関する
+たち
+ます
+ん
+なら
+に対して
+特に
+せる
+及び
+これら
+とき
+では
+にて
+ほか
+ながら
+うち
+そして
+とともに
+ただし
+かつて
+それぞれ
+または
+お
+ほど
+ものの
+に対する
+ほとんど
+と共に
+といった
+です
+とも
+ところ
+ここ
+##### End of file
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_lv.txt b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_lv.txt
new file mode 100644
index 0000000..e21a23c
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_lv.txt
@@ -0,0 +1,172 @@
+# Set of Latvian stopwords from A Stemming Algorithm for Latvian, Karlis Kreslins
+# the original list of over 800 forms was refined: 
+#   pronouns, adverbs, interjections were removed
+# 
+# prepositions
+aiz
+ap
+ar
+apakš
+ārpus
+augšpus
+bez
+caur
+dēļ
+gar
+iekš
+iz
+kopš
+labad
+lejpus
+līdz
+no
+otrpus
+pa
+par
+pār
+pēc
+pie
+pirms
+pret
+priekš
+starp
+šaipus
+uz
+viņpus
+virs
+virspus
+zem
+apakšpus
+# Conjunctions
+un
+bet
+jo
+ja
+ka
+lai
+tomēr
+tikko
+turpretī
+arī
+kaut
+gan
+tādēļ
+tā
+ne
+tikvien
+vien
+kā
+ir
+te
+vai
+kamēr
+# Particles
+ar
+diezin
+droši
+diemžēl
+nebūt
+ik
+it
+taču
+nu
+pat
+tiklab
+iekšpus
+nedz
+tik
+nevis
+turpretim
+jeb
+iekam
+iekām
+iekāms
+kolīdz
+līdzko
+tiklīdz
+jebšu
+tālab
+tāpēc
+nekā
+itin
+jā
+jau
+jel
+nē
+nezin
+tad
+tikai
+vis
+tak
+iekams
+vien
+# modal verbs
+būt  
+biju 
+biji
+bija
+bijām
+bijāt
+esmu
+esi
+esam
+esat 
+būšu     
+būsi
+būs
+būsim
+būsiet
+tikt
+tiku
+tiki
+tika
+tikām
+tikāt
+tieku
+tiec
+tiek
+tiekam
+tiekat
+tikšu
+tiks
+tiksim
+tiksiet
+tapt
+tapi
+tapāt
+topat
+tapšu
+tapsi
+taps
+tapsim
+tapsiet
+kļūt
+kļuvu
+kļuvi
+kļuva
+kļuvām
+kļuvāt
+kļūstu
+kļūsti
+kļūst
+kļūstam
+kļūstat
+kļūšu
+kļūsi
+kļūs
+kļūsim
+kļūsiet
+# verbs
+varēt
+varēju
+varējām
+varēšu
+varēsim
+var
+varēji
+varējāt
+varēsi
+varēsiet
+varat
+varēja
+varēs
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_nl.txt b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_nl.txt
new file mode 100644
index 0000000..f4d61f5
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_nl.txt
@@ -0,0 +1,117 @@
+ | From svn.tartarus.org/snowball/trunk/website/algorithms/dutch/stop.txt
+ | This file is distributed under the BSD License.
+ | See http://snowball.tartarus.org/license.php
+ | Also see http://www.opensource.org/licenses/bsd-license.html
+ |  - Encoding was converted to UTF-8.
+ |  - This notice was added.
+
+ | A Dutch stop word list. Comments begin with vertical bar. Each stop
+ | word is at the start of a line.
+
+ | This is a ranked list (commonest to rarest) of stopwords derived from
+ | a large sample of Dutch text.
+
+ | Dutch stop words frequently exhibit homonym clashes. These are indicated
+ | clearly below.
+
+de             |  the
+en             |  and
+van            |  of, from
+ik             |  I, the ego
+te             |  (1) chez, at etc, (2) to, (3) too
+dat            |  that, which
+die            |  that, those, who, which
+in             |  in, inside
+een            |  a, an, one
+hij            |  he
+het            |  the, it
+niet           |  not, nothing, naught
+zijn           |  (1) to be, being, (2) his, one's, its
+is             |  is
+was            |  (1) was, past tense of all persons sing. of 'zijn' (to be) (2) wax, (3) the washing, (4) rise of river
+op             |  on, upon, at, in, up, used up
+aan            |  on, upon, to (as dative)
+met            |  with, by
+als            |  like, such as, when
+voor           |  (1) before, in front of, (2) furrow
+had            |  had, past tense all persons sing. of 'hebben' (have)
+er             |  there
+maar           |  but, only
+om             |  round, about, for etc
+hem            |  him
+dan            |  then
+zou            |  should/would, past tense all persons sing. of 'zullen'
+of             |  or, whether, if
+wat            |  what, something, anything
+mijn           |  possessive and noun 'mine'
+men            |  people, 'one'
+dit            |  this
+zo             |  so, thus, in this way
+door           |  through, by
+over           |  over, across
+ze             |  she, her, they, them
+zich           |  oneself
+bij            |  (1) a bee, (2) by, near, at
+ook            |  also, too
+tot            |  till, until
+je             |  you
+mij            |  me
+uit            |  out of, from
+der            |  Old Dutch form of 'van der' still found in surnames
+daar           |  (1) there, (2) because
+haar           |  (1) her, their, them, (2) hair
+naar           |  (1) unpleasant, unwell etc, (2) towards, (3) as
+heb            |  present first person sing. of 'to have'
+hoe            |  how, why
+heeft          |  present third person sing. of 'to have'
+hebben         |  'to have' and various parts thereof
+deze           |  this
+u              |  you
+want           |  (1) for, (2) mitten, (3) rigging
+nog            |  yet, still
+zal            |  'shall', first and third person sing. of verb 'zullen' (will)
+me             |  me
+zij            |  she, they
+nu             |  now
+ge             |  'thou', still used in Belgium and south Netherlands
+geen           |  none
+omdat          |  because
+iets           |  something, somewhat
+worden         |  to become, grow, get
+toch           |  yet, still
+al             |  all, every, each
+waren          |  (1) 'were', (2) to wander, (3) wares
+veel           |  much, many
+meer           |  (1) more, (2) lake
+doen           |  to do, to make
+toen           |  then, when
+moet           |  noun 'spot/mote' and present form of 'to must'
+ben            |  (1) am, (2) 'are' in interrogative second person singular of 'to be'
+zonder         |  without
+kan            |  noun 'can' and present form of 'to be able'
+hun            |  their, them
+dus            |  so, consequently
+alles          |  all, everything, anything
+onder          |  under, beneath
+ja             |  yes, of course
+eens           |  once, one day
+hier           |  here
+wie            |  who
+werd           |  imperfect third person sing. of 'become'
+altijd         |  always
+doch           |  yet, but etc
+wordt          |  present third person sing. of 'become'
+wezen          |  (1) to be, (2) 'been' as in 'been fishing', (3) orphans
+kunnen         |  to be able
+ons            |  us/our
+zelf           |  self
+tegen          |  against, towards, at
+na             |  after, near
+reeds          |  already
+wil            |  (1) present tense of 'want', (2) 'will', noun, (3) fender
+kon            |  could; past tense of 'to be able'
+niets          |  nothing
+uw             |  your
+iemand         |  somebody
+geweest        |  been; past participle of 'be'
+andere         |  other
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_no.txt b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_no.txt
new file mode 100644
index 0000000..e76f36e
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_no.txt
@@ -0,0 +1,192 @@
+ | From svn.tartarus.org/snowball/trunk/website/algorithms/norwegian/stop.txt
+ | This file is distributed under the BSD License.
+ | See http://snowball.tartarus.org/license.php
+ | Also see http://www.opensource.org/licenses/bsd-license.html
+ |  - Encoding was converted to UTF-8.
+ |  - This notice was added.
+
+ | A Norwegian stop word list. Comments begin with vertical bar. Each stop
+ | word is at the start of a line.
+
+ | This stop word list is for the dominant bokmål dialect. Words unique
+ | to nynorsk are marked *.
+
+ | Revised by Jan Bruusgaard <Jan.Bruusgaard@ssb.no>, Jan 2005
+
+og             | and
+i              | in
+jeg            | I
+det            | it/this/that
+at             | to (w. inf.)
+en             | a/an
+et             | a/an
+den            | it/this/that
+til            | to
+er             | is/am/are
+som            | who/that
+på             | on
+de             | they / you(formal)
+med            | with
+han            | he
+av             | of
+ikke           | not
+ikkje          | not *
+der            | there
+så             | so
+var            | was/were
+meg            | me
+seg            | oneself (reflexive)
+men            | but
+ett            | one
+har            | have
+om             | about
+vi             | we
+min            | my
+mitt           | my
+ha             | have
+hadde          | had
+hun            | she
+nå             | now
+over           | over
+da             | when/as
+ved            | by/know
+fra            | from
+du             | you
+ut             | out
+sin            | your
+dem            | them
+oss            | us
+opp            | up
+man            | you/one
+kan            | can
+hans           | his
+hvor           | where
+eller          | or
+hva            | what
+skal           | shall/must
+selv           | self (reflexive)
+sjøl           | self (reflexive)
+her            | here
+alle           | all
+vil            | will
+bli            | become
+ble            | became
+blei           | became *
+blitt          | have become
+kunne          | could
+inn            | in
+når            | when
+være           | be
+kom            | come
+noen           | some
+noe            | some
+ville          | would
+dere           | you
+som            | who/which/that
+deres          | their/theirs
+kun            | only/just
+ja             | yes
+etter          | after
+ned            | down
+skulle         | should
+denne          | this
+for            | for/because
+deg            | you
+si             | hers/his
+sine           | hers/his
+sitt           | hers/his
+mot            | against
+å              | to
+meget          | much
+hvorfor        | why
+dette          | this
+disse          | these/those
+uten           | without
+hvordan        | how
+ingen          | none
+din            | your
+ditt           | your
+blir           | become
+samme          | same
+hvilken        | which
+hvilke         | which (plural)
+sånn           | such a
+inni           | inside/within
+mellom         | between
+vår            | our
+hver           | each
+hvem           | who
+vors           | us/ours
+hvis           | whose
+både           | both
+bare           | only/just
+enn            | than
+fordi          | as/because
+før            | before
+mange          | many
+også           | also
+slik           | just
+vært           | been
+være           | to be
+båe            | both *
+begge          | both
+siden          | since
+dykk           | your *
+dykkar         | yours *
+dei            | they *
+deira          | them *
+deires         | theirs *
+deim           | them *
+di             | your (fem.) *
+då             | as/when *
+eg             | I *
+ein            | a/an *
+eit            | a/an *
+eitt           | a/an *
+elles          | or *
+honom          | he *
+hjå            | at *
+ho             | she *
+hoe            | she *
+henne          | her
+hennar         | her/hers
+hennes         | hers
+hoss           | how *
+hossen         | how *
+ikkje          | not *
+ingi           | no one *
+inkje          | no one *
+korleis        | how *
+korso          | how *
+kva            | what/which *
+kvar           | where *
+kvarhelst      | where *
+kven           | who/whom *
+kvi            | why *
+kvifor         | why *
+me             | we *
+medan          | while *
+mi             | my *
+mine           | my *
+mykje          | much *
+no             | now *
+nokon          | some (masc./neut.) *
+noka           | some (fem.) *
+nokor          | some *
+noko           | some *
+nokre          | some *
+si             | his/hers *
+sia            | since *
+sidan          | since *
+so             | so *
+somt           | some *
+somme          | some *
+um             | about *
+upp            | up *
+vere           | be *
+vore           | was *
+verte          | become *
+vort           | become *
+varte          | became *
+vart           | became *
+
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_pt.txt b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_pt.txt
new file mode 100644
index 0000000..276c1b4
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_pt.txt
@@ -0,0 +1,251 @@
+ | From svn.tartarus.org/snowball/trunk/website/algorithms/portuguese/stop.txt
+ | This file is distributed under the BSD License.
+ | See http://snowball.tartarus.org/license.php
+ | Also see http://www.opensource.org/licenses/bsd-license.html
+ |  - Encoding was converted to UTF-8.
+ |  - This notice was added.
+
+ | A Portuguese stop word list. Comments begin with vertical bar. Each stop
+ | word is at the start of a line.
+
+
+ | The following is a ranked list (commonest to rarest) of stopwords
+ | deriving from a large sample of text.
+
+ | Extra words have been added at the end.
+
+de             |  of, from
+a              |  the; to, at; her
+o              |  the; him
+que            |  who, that
+e              |  and
+do             |  de + o
+da             |  de + a
+em             |  in
+um             |  a
+para           |  for
+  | é          from SER
+com            |  with
+não            |  not, no
+uma            |  a
+os             |  the; them
+no             |  em + o
+se             |  himself etc
+na             |  em + a
+por            |  for
+mais           |  more
+as             |  the; them
+dos            |  de + os
+como           |  as, like
+mas            |  but
+  | foi        from SER
+ao             |  a + o
+ele            |  he
+das            |  de + as
+  | tem        from TER
+à              |  a + a
+seu            |  his
+sua            |  her
+ou             |  or
+  | ser        from SER
+quando         |  when
+muito          |  much
+  | há         from HAV
+nos            |  em + os; us
+já             |  already, now
+  | está       from EST
+eu             |  I
+também         |  also
+só             |  only, just
+pelo           |  per + o
+pela           |  per + a
+até            |  up to
+isso           |  that
+ela            |  she
+entre          |  between
+  | era        from SER
+depois         |  after
+sem            |  without
+mesmo          |  same
+aos            |  a + os
+  | ter        from TER
+seus           |  his
+quem           |  whom
+nas            |  em + as
+me             |  me
+esse           |  that
+eles           |  they
+  | estão      from EST
+você           |  you
+  | tinha      from TER
+  | foram      from SER
+essa           |  that
+num            |  em + um
+nem            |  nor
+suas           |  her
+meu            |  my
+às             |  a + as
+minha          |  my
+  | têm        from TER
+numa           |  em + uma
+pelos          |  per + os
+elas           |  they
+  | havia      from HAV
+  | seja       from SER
+qual           |  which
+  | será       from SER
+nós            |  we
+  | tenho      from TER
+lhe            |  to him, her
+deles          |  of them
+essas          |  those
+esses          |  those
+pelas          |  per + as
+este           |  this
+  | fosse      from SER
+dele           |  of him
+
+ | other words. There are many contractions such as naquele = em+aquele,
+ | mo = me+o, but they are rare.
+ | Indefinite article plural forms are also rare.
+
+tu             |  thou
+te             |  thee
+vocês          |  you (plural)
+vos            |  you
+lhes           |  to them
+meus           |  my
+minhas
+teu            |  thy
+tua
+teus
+tuas
+nosso          | our
+nossa
+nossos
+nossas
+
+dela           |  of her
+delas          |  of them
+
+esta           |  this
+estes          |  these
+estas          |  these
+aquele         |  that
+aquela         |  that
+aqueles        |  those
+aquelas        |  those
+isto           |  this
+aquilo         |  that
+
+               | forms of estar, to be (not including the infinitive):
+estou
+está
+estamos
+estão
+estive
+esteve
+estivemos
+estiveram
+estava
+estávamos
+estavam
+estivera
+estivéramos
+esteja
+estejamos
+estejam
+estivesse
+estivéssemos
+estivessem
+estiver
+estivermos
+estiverem
+
+               | forms of haver, to have (not including the infinitive):
+hei
+há
+havemos
+hão
+houve
+houvemos
+houveram
+houvera
+houvéramos
+haja
+hajamos
+hajam
+houvesse
+houvéssemos
+houvessem
+houver
+houvermos
+houverem
+houverei
+houverá
+houveremos
+houverão
+houveria
+houveríamos
+houveriam
+
+               | forms of ser, to be (not including the infinitive):
+sou
+somos
+são
+era
+éramos
+eram
+fui
+foi
+fomos
+foram
+fora
+fôramos
+seja
+sejamos
+sejam
+fosse
+fôssemos
+fossem
+for
+formos
+forem
+serei
+será
+seremos
+serão
+seria
+seríamos
+seriam
+
+               | forms of ter, to have (not including the infinitive):
+tenho
+tem
+temos
+tém
+tinha
+tínhamos
+tinham
+tive
+teve
+tivemos
+tiveram
+tivera
+tivéramos
+tenha
+tenhamos
+tenham
+tivesse
+tivéssemos
+tivessem
+tiver
+tivermos
+tiverem
+terei
+terá
+teremos
+terão
+teria
+teríamos
+teriam
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_ro.txt b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_ro.txt
new file mode 100644
index 0000000..4fdee90
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_ro.txt
@@ -0,0 +1,233 @@
+# This file was created by Jacques Savoy and is distributed under the BSD license.
+# See http://members.unine.ch/jacques.savoy/clef/index.html.
+# Also see http://www.opensource.org/licenses/bsd-license.html
+acea
+aceasta
+această
+aceea
+acei
+aceia
+acel
+acela
+acele
+acelea
+acest
+acesta
+aceste
+acestea
+aceşti
+aceştia
+acolo
+acum
+ai
+aia
+aibă
+aici
+al
+ăla
+ale
+alea
+ălea
+altceva
+altcineva
+am
+ar
+are
+aş
+aşadar
+asemenea
+asta
+ăsta
+astăzi
+astea
+ăstea
+ăştia
+asupra
+aţi
+au
+avea
+avem
+aveţi
+azi
+bine
+bucur
+bună
+ca
+că
+căci
+când
+care
+cărei
+căror
+cărui
+cât
+câte
+câţi
+către
+câtva
+ce
+cel
+ceva
+chiar
+cînd
+cine
+cineva
+cît
+cîte
+cîţi
+cîtva
+contra
+cu
+cum
+cumva
+curând
+curînd
+da
+dă
+dacă
+dar
+datorită
+de
+deci
+deja
+deoarece
+departe
+deşi
+din
+dinaintea
+dintr
+dintre
+drept
+după
+ea
+ei
+el
+ele
+eram
+este
+eşti
+eu
+face
+fără
+fi
+fie
+fiecare
+fii
+fim
+fiţi
+iar
+ieri
+îi
+îl
+îmi
+împotriva
+în 
+înainte
+înaintea
+încât
+încît
+încotro
+între
+întrucât
+întrucît
+îţi
+la
+lângă
+le
+li
+lîngă
+lor
+lui
+mă
+mâine
+mea
+mei
+mele
+mereu
+meu
+mi
+mine
+mult
+multă
+mulţi
+ne
+nicăieri
+nici
+nimeni
+nişte
+noastră
+noastre
+noi
+noştri
+nostru
+nu
+ori
+oricând
+oricare
+oricât
+orice
+oricînd
+oricine
+oricît
+oricum
+oriunde
+până
+pe
+pentru
+peste
+pînă
+poate
+pot
+prea
+prima
+primul
+prin
+printr
+sa
+să
+săi
+sale
+sau
+său
+se
+şi
+sînt
+sîntem
+sînteţi
+spre
+sub
+sunt
+suntem
+sunteţi
+ta
+tăi
+tale
+tău
+te
+ţi
+ţie
+tine
+toată
+toate
+tot
+toţi
+totuşi
+tu
+un
+una
+unde
+undeva
+unei
+unele
+uneori
+unor
+vă
+vi
+voastră
+voastre
+voi
+voştri
+vostru
+vouă
+vreo
+vreun
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_ru.txt b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_ru.txt
new file mode 100644
index 0000000..6430769
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_ru.txt
@@ -0,0 +1,241 @@
+ | From svn.tartarus.org/snowball/trunk/website/algorithms/russian/stop.txt
+ | This file is distributed under the BSD License.
+ | See http://snowball.tartarus.org/license.php
+ | Also see http://www.opensource.org/licenses/bsd-license.html
+ |  - Encoding was converted to UTF-8.
+ |  - This notice was added.
+
+ | a russian stop word list. comments begin with vertical bar. each stop
+ | word is at the start of a line.
+
+ | this is a ranked list (commonest to rarest) of stopwords derived from
+ | a large text sample.
+
+ | letter `ё' is translated to `е'.
+
+и              | and
+в              | in/into
+во             | alternative form
+не             | not
+что            | what/that
+он             | he
+на             | on/onto
+я              | i
+с              | from
+со             | alternative form
+как            | how
+а              | milder form of `no' (but)
+то             | conjunction and form of `that'
+все            | all
+она            | she
+так            | so, thus
+его            | him
+но             | but
+да             | yes/and
+ты             | thou
+к              | towards, by
+у              | around, chez
+же             | intensifier particle
+вы             | you
+за             | beyond, behind
+бы             | conditional/subj. particle
+по             | up to, along
+только         | only
+ее             | her
+мне            | to me
+было           | it was
+вот            | here is/are, particle
+от             | away from
+меня           | me
+еще            | still, yet, more
+нет            | no, there isnt/arent
+о              | about
+из             | out of
+ему            | to him
+теперь         | now
+когда          | when
+даже           | even
+ну             | so, well
+вдруг          | suddenly
+ли             | interrogative particle
+если           | if
+уже            | already, but homonym of `narrower'
+или            | or
+ни             | neither
+быть           | to be
+был            | he was
+него           | prepositional form of его
+до             | up to
+вас            | you accusative
+нибудь         | indef. suffix preceded by hyphen
+опять          | again
+уж             | already, but homonym of `adder'
+вам            | to you
+сказал         | he said
+ведь           | particle `after all'
+там            | there
+потом          | then
+себя           | oneself
+ничего         | nothing
+ей             | to her
+может          | usually with `быть' as `maybe'
+они            | they
+тут            | here
+где            | where
+есть           | there is/are
+надо           | got to, must
+ней            | prepositional form of  ей
+для            | for
+мы             | we
+тебя           | thee
+их             | them, their
+чем            | than
+была           | she was
+сам            | self
+чтоб           | in order to
+без            | without
+будто          | as if
+человек        | man, person, one
+чего           | genitive form of `what'
+раз            | once
+тоже           | also
+себе           | to oneself
+под            | beneath
+жизнь          | life
+будет          | will be
+ж              | short form of intensifier particle `же'
+тогда          | then
+кто            | who
+этот           | this
+говорил        | was saying
+того           | genitive form of `that'
+потому         | for that reason
+этого          | genitive form of `this'
+какой          | which
+совсем         | altogether
+ним            | prepositional form of `его', `они'
+здесь          | here
+этом           | prepositional form of `этот'
+один           | one
+почти          | almost
+мой            | my
+тем            | instrumental/dative plural of `тот', `то'
+чтобы          | full form of `in order that'
+нее            | her (acc.)
+кажется        | it seems
+сейчас         | now
+были           | they were
+куда           | where to
+зачем          | why
+сказать        | to say
+всех           | all (acc., gen. preposn. plural)
+никогда        | never
+сегодня        | today
+можно          | possible, one can
+при            | by
+наконец        | finally
+два            | two
+об             | alternative form of `о', about
+другой         | another
+хоть           | even
+после          | after
+над            | above
+больше         | more
+тот            | that one (masc.)
+через          | across, in
+эти            | these
+нас            | us
+про            | about
+всего          | in all, only, of all
+них            | prepositional form of `они' (they)
+какая          | which, feminine
+много          | lots
+разве          | interrogative particle
+сказала        | she said
+три            | three
+эту            | this, acc. fem. sing.
+моя            | my, feminine
+впрочем        | moreover, besides
+хорошо         | good
+свою           | ones own, acc. fem. sing.
+этой           | oblique form of `эта', fem. `this'
+перед          | in front of
+иногда         | sometimes
+лучше          | better
+чуть           | a little
+том            | preposn. form of `that one'
+нельзя         | one must not
+такой          | such a one
+им             | to them
+более          | more
+всегда         | always
+конечно        | of course
+всю            | acc. fem. sing of `all'
+между          | between
+
+
+  | b: some paradigms
+  |
+  | personal pronouns
+  |
+  | я  меня  мне  мной  [мною]
+  | ты  тебя  тебе  тобой  [тобою]
+  | он  его  ему  им  [него, нему, ним]
+  | она  ее  ей  ею  [нее, ней, нею]
+  | оно  его  ему  им  [него, нему, ним]
+  |
+  | мы  нас  нам  нами
+  | вы  вас  вам  вами
+  | они  их  им  ими  [них, ним, ними]
+  |
+  |   себя  себе  собой   [собою]
+  |
+  | demonstrative pronouns: этот (this), тот (that)
+  |
+  | этот  эта  это  эти
+  | этого  эту  это  эти
+  | этого  этой  этого  этих
+  | этому  этой  этому  этим
+  | этим  этой  этим  [этою]  этими
+  | этом  этой  этом  этих
+  |
+  | тот  та  то  те
+  | того  ту  то  те
+  | того  той  того  тех
+  | тому  той  тому  тем
+  | тем  той  тем  [тою]  теми
+  | том  той  том  тех
+  |
+  | determinative pronouns
+  |
+  | (a) весь (all)
+  |
+  | весь  вся  все  все
+  | всего  всю  все  все
+  | всего  всей  всего  всех
+  | всему  всей  всему  всем
+  | всем  всей  всем  [всею]  всеми
+  | всем  всей  всем  всех
+  |
+  | (b) сам (himself etc)
+  |
+  | сам  сама  само  сами
+  | самого саму  само  самих
+  | самого самой самого  самих
+  | самому самой самому  самим
+  | самим  самой  самим  [самою]  самими
+  | самом самой самом  самих
+  |
+  | stems of verbs `to be', `to have', `to do' and modal
+  |
+  | быть  бы  буд  быв  есть  суть
+  | име
+  | дел
+  | мог   мож  мочь
+  | уме
+  | хоч  хот
+  | долж
+  | можн
+  | нужн
+  | нельзя
+
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_sv.txt b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_sv.txt
new file mode 100644
index 0000000..22bddfd
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_sv.txt
@@ -0,0 +1,131 @@
+ | From svn.tartarus.org/snowball/trunk/website/algorithms/swedish/stop.txt
+ | This file is distributed under the BSD License.
+ | See http://snowball.tartarus.org/license.php
+ | Also see http://www.opensource.org/licenses/bsd-license.html
+ |  - Encoding was converted to UTF-8.
+ |  - This notice was added.
+
+ | A Swedish stop word list. Comments begin with vertical bar. Each stop
+ | word is at the start of a line.
+
+ | This is a ranked list (commonest to rarest) of stopwords derived from
+ | a large text sample.
+
+ | Swedish stop words occasionally exhibit homonym clashes. For example
+ |  så = so, but also seed. These are indicated clearly below.
+
+och            | and
+det            | it, this/that
+att            | to (with infinitive)
+i              | in, at
+en             | a
+jag            | I
+hon            | she
+som            | who, that
+han            | he
+på             | on
+den            | it, this/that
+med            | with
+var            | where, each
+sig            | him(self) etc
+för            | for
+så             | so (also: seed)
+till           | to
+är             | is
+men            | but
+ett            | a
+om             | if; around, about
+hade           | had
+de             | they, these/those
+av             | of
+icke           | not, no
+mig            | me
+du             | you
+henne          | her
+då             | then, when
+sin            | his
+nu             | now
+har            | have
+inte           | inte någon = no one
+hans           | his
+honom          | him
+skulle         | 'sake'
+hennes         | her
+där            | there
+min            | my
+man            | one (pronoun)
+ej             | nor
+vid            | at, by, on (also: vast)
+kunde          | could
+något          | some etc
+från           | from, off
+ut             | out
+när            | when
+efter          | after, behind
+upp            | up
+vi             | we
+dem            | them
+vara           | be
+vad            | what
+över           | over
+än             | than
+dig            | you
+kan            | can
+sina           | his
+här            | here
+ha             | have
+mot            | towards
+alla           | all
+under          | under (also: wonder)
+någon          | some etc
+eller          | or (else)
+allt           | all
+mycket         | much
+sedan          | since
+ju             | why
+denna          | this/that
+själv          | myself, yourself etc
+detta          | this/that
+åt             | to
+utan           | without
+varit          | was
+hur            | how
+ingen          | no
+mitt           | my
+ni             | you
+bli            | to be, become
+blev           | from bli
+oss            | us
+din            | thy
+dessa          | these/those
+några          | some etc
+deras          | their
+blir           | from bli
+mina           | my
+samma          | (the) same
+vilken         | who, that
+er             | you, your
+sådan          | such a
+vår            | our
+blivit         | from bli
+dess           | its
+inom           | within
+mellan         | between
+sådant         | such a
+varför         | why
+varje          | each
+vilka          | who, that
+ditt           | thy
+vem            | who
+vilket         | who, that
+sitta          | his
+sådana         | such a
+vart           | each
+dina           | thy
+vars           | whose
+vårt           | our
+våra           | our
+ert            | your
+era            | your
+vilkas         | whose
+
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_th.txt b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_th.txt
new file mode 100644
index 0000000..07f0fab
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_th.txt
@@ -0,0 +1,119 @@
+# Thai stopwords from:
+# "Opinion Detection in Thai Political News Columns
+# Based on Subjectivity Analysis"
+# Khampol Sukhum, Supot Nitsuwat, and Choochart Haruechaiyasak
+ไว้
+ไม่
+ไป
+ได้
+ให้
+ใน
+โดย
+แห่ง
+แล้ว
+และ
+แรก
+แบบ
+แต่
+เอง
+เห็น
+เลย
+เริ่ม
+เรา
+เมื่อ
+เพื่อ
+เพราะ
+เป็นการ
+เป็น
+เปิดเผย
+เปิด
+เนื่องจาก
+เดียวกัน
+เดียว
+เช่น
+เฉพาะ
+เคย
+เข้า
+เขา
+อีก
+อาจ
+อะไร
+ออก
+อย่าง
+อยู่
+อยาก
+หาก
+หลาย
+หลังจาก
+หลัง
+หรือ
+หนึ่ง
+ส่วน
+ส่ง
+สุด
+สําหรับ
+ว่า
+วัน
+ลง
+ร่วม
+ราย
+รับ
+ระหว่าง
+รวม
+ยัง
+มี
+มาก
+มา
+พร้อม
+พบ
+ผ่าน
+ผล
+บาง
+น่า
+นี้
+นํา
+นั้น
+นัก
+นอกจาก
+ทุก
+ที่สุด
+ที่
+ทําให้
+ทํา
+ทาง
+ทั้งนี้
+ทั้ง
+ถ้า
+ถูก
+ถึง
+ต้อง
+ต่างๆ
+ต่าง
+ต่อ
+ตาม
+ตั้งแต่
+ตั้ง
+ด้าน
+ด้วย
+ดัง
+ซึ่ง
+ช่วง
+จึง
+จาก
+จัด
+จะ
+คือ
+ความ
+ครั้ง
+คง
+ขึ้น
+ของ
+ขอ
+ขณะ
+ก่อน
+ก็
+การ
+กับ
+กัน
+กว่า
+กล่าว
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_tr.txt b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_tr.txt
new file mode 100644
index 0000000..84d9408
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/stopwords_tr.txt
@@ -0,0 +1,212 @@
+# Turkish stopwords from LUCENE-559
+# merged with the list from "Information Retrieval on Turkish Texts"
+#   (http://www.users.muohio.edu/canf/papers/JASIST2008offPrint.pdf)
+acaba
+altmış
+altı
+ama
+ancak
+arada
+aslında
+ayrıca
+bana
+bazı
+belki
+ben
+benden
+beni
+benim
+beri
+beş
+bile
+bin
+bir
+birçok
+biri
+birkaç
+birkez
+birşey
+birşeyi
+biz
+bize
+bizden
+bizi
+bizim
+böyle
+böylece
+bu
+buna
+bunda
+bundan
+bunlar
+bunları
+bunların
+bunu
+bunun
+burada
+çok
+çünkü
+da
+daha
+dahi
+de
+defa
+değil
+diğer
+diye
+doksan
+dokuz
+dolayı
+dolayısıyla
+dört
+edecek
+eden
+ederek
+edilecek
+ediliyor
+edilmesi
+ediyor
+eğer
+elli
+en
+etmesi
+etti
+ettiği
+ettiğini
+gibi
+göre
+halen
+hangi
+hatta
+hem
+henüz
+hep
+hepsi
+her
+herhangi
+herkesin
+hiç
+hiçbir
+için
+iki
+ile
+ilgili
+ise
+işte
+itibaren
+itibariyle
+kadar
+karşın
+katrilyon
+kendi
+kendilerine
+kendini
+kendisi
+kendisine
+kendisini
+kez
+ki
+kim
+kimden
+kime
+kimi
+kimse
+kırk
+milyar
+milyon
+mu
+mü
+mı
+nasıl
+ne
+neden
+nedenle
+nerde
+nerede
+nereye
+niye
+niçin
+o
+olan
+olarak
+oldu
+olduğu
+olduğunu
+olduklarını
+olmadı
+olmadığı
+olmak
+olması
+olmayan
+olmaz
+olsa
+olsun
+olup
+olur
+olursa
+oluyor
+on
+ona
+ondan
+onlar
+onlardan
+onları
+onların
+onu
+onun
+otuz
+oysa
+öyle
+pek
+rağmen
+sadece
+sanki
+sekiz
+seksen
+sen
+senden
+seni
+senin
+siz
+sizden
+sizi
+sizin
+şey
+şeyden
+şeyi
+şeyler
+şöyle
+şu
+şuna
+şunda
+şundan
+şunları
+şunu
+tarafından
+trilyon
+tüm
+üç
+üzere
+var
+vardı
+ve
+veya
+ya
+yani
+yapacak
+yapılan
+yapılması
+yapıyor
+yapmak
+yaptı
+yaptığı
+yaptığını
+yaptıkları
+yedi
+yerine
+yetmiş
+yine
+yirmi
+yoksa
+yüz
+zaten
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/userdict_ja.txt b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/userdict_ja.txt
new file mode 100644
index 0000000..6f0368e
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/lang/userdict_ja.txt
@@ -0,0 +1,29 @@
+#
+# This is a sample user dictionary for Kuromoji (JapaneseTokenizer)
+#
+# Add entries to this file in order to override the statistical model in terms
+# of segmentation, readings and part-of-speech tags.  Notice that entries do
+# not have weights since they are always used when found.  This is by-design
+# in order to maximize ease-of-use.
+#
+# Entries are defined using the following CSV format:
+#  <text>,<token 1> ... <token n>,<reading 1> ... <reading n>,<part-of-speech tag>
+#
+# Notice that a single half-width space separates tokens and readings, and
+# that the number of tokens and readings must match exactly.
+#
+# Also notice that the behavior of multiple entries with the same <text> is undefined.
+#
+# Whitespace only lines are ignored.  Comments are not allowed on entry lines.
+#
+
+# Custom segmentation for kanji compounds
+日本経済新聞,日本 経済 新聞,ニホン ケイザイ シンブン,カスタム名詞
+関西国際空港,関西 国際 空港,カンサイ コクサイ クウコウ,カスタム名詞
+
+# Custom segmentation for compound katakana
+トートバッグ,トート バッグ,トート バッグ,かずカナ名詞
+ショルダーバッグ,ショルダー バッグ,ショルダー バッグ,かずカナ名詞
+
+# Custom reading for former sumo wrestler
+朝青龍,朝青龍,アサショウリュウ,カスタム人名
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/protwords.txt b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/protwords.txt
new file mode 100644
index 0000000..1dfc0ab
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/protwords.txt
@@ -0,0 +1,21 @@
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+#-----------------------------------------------------------------------
+# Use a protected word file to protect against the stemmer reducing two
+# unrelated words to the same base word.
+
+# Some non-words that normally won't be encountered,
+# just to test that they won't be stemmed.
+dontstems
+zwhacky
+
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/schema.xml b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/schema.xml
new file mode 100644
index 0000000..65991e1
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/schema.xml
@@ -0,0 +1,947 @@
+<?xml version="1.0" encoding="UTF-8" ?>
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements.  See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License.  You may obtain a copy of the License at
+
+     http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+
+<!--
+ This is the Solr schema file. This file should be named "schema.xml" and
+ should be in the conf directory under the solr home
+ (i.e. ./solr/conf/schema.xml by default)
+ or located where the classloader for the Solr webapp can find it.
+
+ This example schema is the recommended starting point for users.
+ It should be kept correct and concise, usable out-of-the-box.
+
+ For more information on how to customize this file, please see
+ http://wiki.apache.org/solr/SchemaXml
+
+ PERFORMANCE NOTE: this schema includes many optional features and should not
+ be used for benchmarking.  To improve performance one could
+  - set stored="false" for all fields possible (esp large fields) when you
+    only need to search on the field but don't need to return the original
+    value.
+  - set indexed="false" if you don't need to search on the field, but only
+    return the field as a result of searching on other indexed fields.
+  - remove all unneeded copyField statements
+  - for best index size and searching performance, set indexed="false"
+    for all general text fields, use copyField to copy them to the
+    catchall "text" field, and use that for searching.
+  - For maximum indexing performance, use the StreamingUpdateSolrServer
+    java client.
+  - Remember to run the JVM in server mode, and use a higher logging level
+    that avoids logging every request
+-->
+
+<schema name="example" version="1.5">
+  <!-- attribute "name" is the name of this schema and is only used for display purposes.
+       version="x.y" is Solr's version number for the schema syntax and
+       semantics.  It should not normally be changed by applications.
+
+       1.0: multiValued attribute did not exist, all fields are multiValued
+            by nature
+       1.1: multiValued attribute introduced, false by default
+       1.2: omitTermFreqAndPositions attribute introduced, true by default
+            except for text fields.
+       1.3: removed optional field compress feature
+       1.4: autoGeneratePhraseQueries attribute introduced to drive QueryParser
+            behavior when a single string produces multiple tokens.  Defaults
+            to off for version >= 1.4
+       1.5: omitNorms defaults to true for primitive field types
+            (int, float, boolean, string...)
+     -->
+
+ <fields>
+   <!-- Valid attributes for fields:
+     name: mandatory - the name for the field
+     type: mandatory - the name of a field type from the
+       <types> fieldType section
+     indexed: true if this field should be indexed (searchable or sortable)
+     stored: true if this field should be retrievable
+     multiValued: true if this field may contain multiple values per document
+     omitNorms: (expert) set to true to omit the norms associated with
+       this field (this disables length normalization and index-time
+       boosting for the field, and saves some memory).  Only full-text
+       fields or fields that need an index-time boost need norms.
+       Norms are omitted for primitive (non-analyzed) types by default.
+     termVectors: [false] set to true to store the term vector for a
+       given field.
+       When using MoreLikeThis, fields used for similarity should be
+       stored for best performance.
+     termPositions: Store position information with the term vector.
+       This will increase storage costs.
+     termOffsets: Store offset information with the term vector. This
+       will increase storage costs.
+     required: The field is required.  It will throw an error if the
+       value does not exist
+     default: a value that should be used if no value is specified
+       when adding a document.
+   -->
+
+   <!-- field names should consist of alphanumeric or underscore characters only and
+      not start with a digit.  This is not currently strictly enforced,
+      but other field names will not have first class support from all components
+      and back compatibility is not guaranteed.  Names with both leading and
+      trailing underscores (e.g. _version_) are reserved.
+   -->
+
+   <field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" />
+   <field name="user_friends_count" type="tint" indexed="true" stored="true" />
+   <field name="user_location" type="lowercase" indexed="true" stored="true" />
+   <field name="user_description" type="text_en" indexed="true" stored="false"/>
+   <field name="user_statuses_count" type="tint" indexed="true" stored="true" />
+   <field name="user_followers_count" type="tint" indexed="true" stored="true"/>
+   <field name="user_name" type="text_en" indexed="true" stored="true" />
+   <field name="user_screen_name" type="text_en" indexed="true" stored="true" />
+   <field name="created_at" type="tdate" indexed="true" stored="true" />
+   <field name="text" type="text_en" indexed="true" stored="true" multiValued="true" />
+   <field name="retweet_count" type="tint" indexed="true" stored="true" />
+   <field name="retweeted" type="boolean" indexed="true" stored="false" />
+   <field name="in_reply_to_user_id" type="long" indexed="true" stored="true" />
+   <field name="source" type="lowercase" indexed="true" stored="true" />
+   <field name="in_reply_to_status_id" type="long" indexed="true" stored="true" multiValued="true"/>
+   <field name="media_url_https" type="string" indexed="false" stored="true" />
+   <field name="expanded_url" type="string" indexed="false" stored="true" />
+
+   <!-- file metadata -->
+   <field name="file_download_url" type="string" indexed="false" stored="true" />
+   <field name="file_upload_url" type="string" indexed="false" stored="true" />
+   <field name="file_scheme" type="string" indexed="true" stored="true" />
+   <field name="file_host" type="string" indexed="true" stored="true" />
+   <field name="file_port" type="int" indexed="true" stored="true" />
+   <field name="file_path" type="string" indexed="true" stored="true" />
+   <field name="file_name" type="string" indexed="true" stored="true" />
+   <field name="file_length" type="tlong" indexed="true" stored="true" />
+   <field name="file_last_modified" type="tlong" indexed="true" stored="true" />
+   <field name="file_owner" type="string" indexed="true" stored="true" />
+   <field name="file_group" type="string" indexed="true" stored="true" />
+   <field name="file_permissions_user" type="string" indexed="true" stored="true" />
+   <field name="file_permissions_group" type="string" indexed="true" stored="true" />
+   <field name="file_permissions_other" type="string" indexed="true" stored="true" />
+   <field name="file_permissions_stickybit" type="boolean" indexed="true" stored="true" />
+
+   <!-- tika metadata -->
+   <field name="content_type" type="lowercase" indexed="true" stored="true" />
+
+   <field name="_version_" type="long" indexed="true" stored="true"/>
+   <dynamicField name="ignored_*" type="ignored"/>
+
+ </fields>
+
+
+ <!-- Field to use to determine and enforce document uniqueness.
+      Unless this field is marked with required="false", it will be a required field
+   -->
+ <uniqueKey>id</uniqueKey>
+
+ <!-- DEPRECATED: The defaultSearchField is consulted by various query parsers when
+  parsing a query string that isn't explicit about the field.  Machine (non-user)
+  generated queries are best made explicit, or they can use the "df" request parameter
+  which takes precedence over this.
+  Note: Un-commenting defaultSearchField will be insufficient if your request handler
+  in solrconfig.xml defines "df", which takes precedence. That would need to be removed.
+ <defaultSearchField>text</defaultSearchField> -->
+
+ <!-- DEPRECATED: The defaultOperator (AND|OR) is consulted by various query parsers
+  when parsing a query string to determine if a clause of the query should be marked as
+  required or optional, assuming the clause isn't already marked by some operator.
+  The default is OR, which is generally assumed so it is not a good idea to change it
+  globally here.  The "q.op" request parameter takes precedence over this.
+ <solrQueryParser defaultOperator="OR"/> -->
+
+  <!-- copyField commands copy one field to another at the time a document
+        is added to the index.  It's used either to index the same field differently,
+        or to add multiple fields to the same field for easier/faster searching.  -->
+
+
+  <types>
+    <!-- field type definitions. The "name" attribute is
+       just a label to be used by field definitions.  The "class"
+       attribute and any other attributes determine the real
+       behavior of the fieldType.
+         Class names starting with "solr" refer to java classes in a
+       standard package such as org.apache.solr.analysis
+    -->
+
+    <!-- The StrField type is not analyzed, but indexed/stored verbatim. -->
+    <fieldType name="string" class="solr.StrField" sortMissingLast="true" />
+
+    <!-- boolean type: "true" or "false" -->
+    <fieldType name="boolean" class="solr.BoolField" sortMissingLast="true"/>
+
+    <!-- sortMissingLast and sortMissingFirst are optional attributes that are
+         currently supported on types that are sorted internally as strings
+         and on numeric types.
+	     This includes "string","boolean", and, as of 3.5 (and 4.x),
+	     int, float, long, date, double, including the "Trie" variants.
+       - If sortMissingLast="true", then a sort on this field will cause documents
+         without the field to come after documents with the field,
+         regardless of the requested sort order (asc or desc).
+       - If sortMissingFirst="true", then a sort on this field will cause documents
+         without the field to come before documents with the field,
+         regardless of the requested sort order.
+       - If sortMissingLast="false" and sortMissingFirst="false" (the default),
+         then default lucene sorting will be used which places docs without the
+         field first in an ascending sort and last in a descending sort.
+    -->
+
+    <!--
+      Default numeric field types. For faster range queries, consider the tint/tfloat/tlong/tdouble types.
+    -->
+    <fieldType name="int" class="solr.TrieIntField" precisionStep="0" positionIncrementGap="0"/>
+    <fieldType name="float" class="solr.TrieFloatField" precisionStep="0" positionIncrementGap="0"/>
+    <fieldType name="long" class="solr.TrieLongField" precisionStep="0" positionIncrementGap="0"/>
+    <fieldType name="double" class="solr.TrieDoubleField" precisionStep="0" positionIncrementGap="0"/>
+
+    <!--
+     Numeric field types that index each value at various levels of precision
+     to accelerate range queries when the number of values between the range
+     endpoints is large. See the javadoc for NumericRangeQuery for internal
+     implementation details.
+
+     Smaller precisionStep values (specified in bits) will lead to more tokens
+     indexed per value, slightly larger index size, and faster range queries.
+     A precisionStep of 0 disables indexing at different precision levels.
+    -->
+    <fieldType name="tint" class="solr.TrieIntField" precisionStep="8" positionIncrementGap="0"/>
+    <fieldType name="tfloat" class="solr.TrieFloatField" precisionStep="8" positionIncrementGap="0"/>
+    <fieldType name="tlong" class="solr.TrieLongField" precisionStep="8" positionIncrementGap="0"/>
+    <fieldType name="tdouble" class="solr.TrieDoubleField" precisionStep="8" positionIncrementGap="0"/>
+
+    <!-- The format for this date field is of the form 1995-12-31T23:59:59Z, and
+         is a more restricted form of the canonical representation of dateTime
+         http://www.w3.org/TR/xmlschema-2/#dateTime
+         The trailing "Z" designates UTC time and is mandatory.
+         Optional fractional seconds are allowed: 1995-12-31T23:59:59.999Z
+         All other components are mandatory.
+
+         Expressions can also be used to denote calculations that should be
+         performed relative to "NOW" to determine the value, e.g.:
+
+               NOW/HOUR
+                  ... Round to the start of the current hour
+               NOW-1DAY
+                  ... Exactly 1 day prior to now
+               NOW/DAY+6MONTHS+3DAYS
+                  ... 6 months and 3 days in the future from the start of
+                      the current day
+
+         Consult the DateField javadocs for more information.
+
+         Note: For faster range queries, consider the tdate type
+      -->
+    <fieldType name="date" class="solr.TrieDateField" precisionStep="0" positionIncrementGap="0"/>
+
+    <!-- A Trie based date field for faster date range queries and date faceting. -->
+    <fieldType name="tdate" class="solr.TrieDateField" precisionStep="6" positionIncrementGap="0"/>
+
+
+    <!-- Binary data type. The data should be sent/retrieved as Base64-encoded Strings -->
+    <fieldtype name="binary" class="solr.BinaryField"/>
+
+    <!--
+      Note:
+      These should only be used for compatibility with existing indexes (created with lucene or older Solr versions).
+      Use Trie based fields instead. As of Solr 3.5 and 4.x, Trie based fields support sortMissingFirst/Last
+
+      Plain numeric field types that store and index the text
+      value verbatim (and hence don't correctly support range queries, since the
+      lexicographic ordering isn't equal to the numeric ordering)
+    -->
+    <fieldType name="pint" class="solr.IntField"/>
+    <fieldType name="plong" class="solr.LongField"/>
+    <fieldType name="pfloat" class="solr.FloatField"/>
+    <fieldType name="pdouble" class="solr.DoubleField"/>
+    <fieldType name="pdate" class="solr.DateField" sortMissingLast="true"/>
+
+    <!-- The "RandomSortField" is not used to store or search any
+         data.  You can declare fields of this type in your schema
+         to generate pseudo-random orderings of your docs for sorting
+         or function purposes.  The ordering is generated based on the field
+         name and the version of the index. As long as the index version
+         remains unchanged, and the same field name is reused,
+         the ordering of the docs will be consistent.
+         If you want different pseudo-random orderings of documents,
+         for the same version of the index, use a dynamicField and
+         change the field name in the request.
+     -->
+    <fieldType name="random" class="solr.RandomSortField" indexed="true" />
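+
+    <!-- Usage sketch for the "random" type above. The dynamicField name
+         pattern "random_*" and the suffix 1234 are illustrative
+         assumptions, not taken from this schema:
+
+           <dynamicField name="random_*" type="random" />
+
+         A request could then sort with, for example, sort=random_1234 desc;
+         changing the suffix (random_5678) should yield a different ordering
+         for the same index version.
+     -->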
+
+    <!-- solr.TextField allows the specification of custom text analyzers
+         specified as a tokenizer and a list of token filters. Different
+         analyzers may be specified for indexing and querying.
+
+         The optional positionIncrementGap puts space between multiple fields of
+         this type on the same document, with the purpose of preventing false phrase
+         matching across fields.
+
+         For more info on customizing your analyzer chain, please see
+         http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters
+     -->
+
+    <!-- One can also specify an existing Analyzer class that has a
+         default constructor via the class attribute on the analyzer element.
+         Example:
+    <fieldType name="text_greek" class="solr.TextField">
+      <analyzer class="org.apache.lucene.analysis.el.GreekAnalyzer"/>
+    </fieldType>
+    -->
+
+    <!-- A text field that only splits on whitespace for exact matching of words -->
+    <fieldType name="text_ws" class="solr.TextField" positionIncrementGap="100">
+      <analyzer>
+        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
+      </analyzer>
+    </fieldType>
+
+    <!-- A general text field that has reasonable, generic
+         cross-language defaults: it tokenizes with StandardTokenizer,
+	 removes stop words from case-insensitive "stopwords.txt"
+	 (empty by default), and down cases.  At query time only, it
+	 also applies synonyms. -->
+    <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
+      <analyzer type="index">
+        <tokenizer class="solr.StandardTokenizerFactory"/>
+        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
+        <!-- in this example, we will only use synonyms at query time
+        <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
+        -->
+        <filter class="solr.LowerCaseFilterFactory"/>
+      </analyzer>
+      <analyzer type="query">
+        <tokenizer class="solr.StandardTokenizerFactory"/>
+        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
+        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
+        <filter class="solr.LowerCaseFilterFactory"/>
+      </analyzer>
+    </fieldType>
+
+    <!-- A text field with defaults appropriate for English: it
+         tokenizes with StandardTokenizer, removes English stop words
+         (lang/stopwords_en.txt), down cases, protects words from protwords.txt, and
+         finally applies Porter's stemming.  The query time analyzer
+         also applies synonyms from synonyms.txt. -->
+    <fieldType name="text_en" class="solr.TextField" positionIncrementGap="100">
+      <analyzer type="index">
+        <tokenizer class="solr.StandardTokenizerFactory"/>
+        <!-- in this example, we will only use synonyms at query time
+        <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
+        -->
+        <!-- Case insensitive stop word removal.
+          add enablePositionIncrements=true in both the index and query
+          analyzers to leave a 'gap' for more accurate phrase queries.
+        -->
+        <filter class="solr.StopFilterFactory"
+                ignoreCase="true"
+                words="lang/stopwords_en.txt"
+                enablePositionIncrements="true"
+                />
+        <filter class="solr.LowerCaseFilterFactory"/>
+	<filter class="solr.EnglishPossessiveFilterFactory"/>
+        <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
+	<!-- Optionally you may want to use this less aggressive stemmer instead of PorterStemFilterFactory:
+        <filter class="solr.EnglishMinimalStemFilterFactory"/>
+	-->
+        <filter class="solr.PorterStemFilterFactory"/>
+      </analyzer>
+      <analyzer type="query">
+        <tokenizer class="solr.StandardTokenizerFactory"/>
+        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
+        <filter class="solr.StopFilterFactory"
+                ignoreCase="true"
+                words="lang/stopwords_en.txt"
+                enablePositionIncrements="true"
+                />
+        <filter class="solr.LowerCaseFilterFactory"/>
+	<filter class="solr.EnglishPossessiveFilterFactory"/>
+        <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
+	<!-- Optionally you may want to use this less aggressive stemmer instead of PorterStemFilterFactory:
+        <filter class="solr.EnglishMinimalStemFilterFactory"/>
+	-->
+        <filter class="solr.PorterStemFilterFactory"/>
+      </analyzer>
+    </fieldType>
+
+    <!-- A text field with defaults appropriate for English, plus
+	 aggressive word-splitting and autophrase features enabled.
+	 This field is just like text_en, except it adds
+	 WordDelimiterFilter to enable splitting and matching of
+	 words on case-change, alpha numeric boundaries, and
+	 non-alphanumeric chars.  This means certain compound word
+	 cases will work, for example query "wi fi" will match
+	 document "WiFi" or "wi-fi".
+        -->
+    <fieldType name="text_en_splitting" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="true">
+      <analyzer type="index">
+        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
+        <!-- in this example, we will only use synonyms at query time
+        <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
+        -->
+        <!-- Case insensitive stop word removal.
+          add enablePositionIncrements=true in both the index and query
+          analyzers to leave a 'gap' for more accurate phrase queries.
+        -->
+        <filter class="solr.StopFilterFactory"
+                ignoreCase="true"
+                words="lang/stopwords_en.txt"
+                enablePositionIncrements="true"
+                />
+        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
+        <filter class="solr.LowerCaseFilterFactory"/>
+        <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
+        <filter class="solr.PorterStemFilterFactory"/>
+      </analyzer>
+      <analyzer type="query">
+        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
+        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
+        <filter class="solr.StopFilterFactory"
+                ignoreCase="true"
+                words="lang/stopwords_en.txt"
+                enablePositionIncrements="true"
+                />
+        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
+        <filter class="solr.LowerCaseFilterFactory"/>
+        <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
+        <filter class="solr.PorterStemFilterFactory"/>
+      </analyzer>
+    </fieldType>
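+
+    <!-- Rough illustration of the text_en_splitting chain above, shown
+         before stemming (the input "Wi-Fi" is just an example):
+
+           index time:  Wi-Fi  =>  wi, fi, wifi   (catenateWords="1" adds "wifi")
+           query time:  wi fi  =>  wi, fi         (catenateWords="0")
+
+         The query tokens "wi" and "fi" line up with the word parts indexed
+         for "Wi-Fi" or "WiFi", which is why the compound forms match.
+     -->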
+
+    <!-- Less flexible matching, but fewer false matches.  Probably not ideal for product names,
+         but may be good for SKUs.  Can insert dashes in the wrong place and still match. -->
+    <fieldType name="text_en_splitting_tight" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="true">
+      <analyzer>
+        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
+        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="false"/>
+        <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt"/>
+        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="0" generateNumberParts="0" catenateWords="1" catenateNumbers="1" catenateAll="0"/>
+        <filter class="solr.LowerCaseFilterFactory"/>
+        <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
+        <filter class="solr.EnglishMinimalStemFilterFactory"/>
+        <!-- this filter can remove any duplicate tokens that appear at the same position - sometimes
+             possible with WordDelimiterFilter in conjunction with stemming. -->
+        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
+      </analyzer>
+    </fieldType>
+
+    <!-- Just like text_general except it reverses the characters of
+	 each token, to enable more efficient leading wildcard queries. -->
+    <fieldType name="text_general_rev" class="solr.TextField" positionIncrementGap="100">
+      <analyzer type="index">
+        <tokenizer class="solr.StandardTokenizerFactory"/>
+        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
+        <filter class="solr.LowerCaseFilterFactory"/>
+        <filter class="solr.ReversedWildcardFilterFactory" withOriginal="true"
+           maxPosAsterisk="3" maxPosQuestion="2" maxFractionAsterisk="0.33"/>
+      </analyzer>
+      <analyzer type="query">
+        <tokenizer class="solr.StandardTokenizerFactory"/>
+        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
+        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
+        <filter class="solr.LowerCaseFilterFactory"/>
+      </analyzer>
+    </fieldType>
+
+    <!-- charFilter + WhitespaceTokenizer  -->
+    <!--
+    <fieldType name="text_char_norm" class="solr.TextField" positionIncrementGap="100" >
+      <analyzer>
+        <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
+        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
+      </analyzer>
+    </fieldType>
+    -->
+
+    <!-- This is an example of using the KeywordTokenizer along
+         with various TokenFilterFactories to produce a sortable field
+         that does not include some properties of the source text
+      -->
+    <fieldType name="alphaOnlySort" class="solr.TextField" sortMissingLast="true" omitNorms="true">
+      <analyzer>
+        <!-- KeywordTokenizer does no actual tokenizing, so the entire
+             input string is preserved as a single token
+          -->
+        <tokenizer class="solr.KeywordTokenizerFactory"/>
+        <!-- The LowerCase TokenFilter does what you expect, which can be useful
+             when you want your sorting to be case insensitive
+          -->
+        <filter class="solr.LowerCaseFilterFactory" />
+        <!-- The TrimFilter removes any leading or trailing whitespace -->
+        <filter class="solr.TrimFilterFactory" />
+        <!-- The PatternReplaceFilter gives you the flexibility to use
+             Java Regular expression to replace any sequence of characters
+             matching a pattern with an arbitrary replacement string,
+             which may include back references to portions of the original
+             string matched by the pattern.
+
+             See the Java Regular Expression documentation for more
+             information on pattern and replacement string syntax.
+
+             http://java.sun.com/j2se/1.6.0/docs/api/java/util/regex/package-summary.html
+          -->
+        <filter class="solr.PatternReplaceFilterFactory"
+                pattern="([^a-z])" replacement="" replace="all"
+        />
+      </analyzer>
+    </fieldType>
+
+    <!--
+    <fieldtype name="phonetic" stored="false" indexed="true" class="solr.TextField" >
+      <analyzer>
+        <tokenizer class="solr.StandardTokenizerFactory"/>
+        <filter class="solr.DoubleMetaphoneFilterFactory" inject="false"/>
+      </analyzer>
+    </fieldtype>
+    -->
+
+    <fieldtype name="payloads" stored="false" indexed="true" class="solr.TextField" >
+      <analyzer>
+        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
+        <!--
+        The DelimitedPayloadTokenFilter can put payloads on tokens... for example,
+        a token of "foo|1.4"  would be indexed as "foo" with a payload of 1.4f
+        Attributes of the DelimitedPayloadTokenFilterFactory:
+         "delimiter" - a one character delimiter. Default is | (pipe)
+         "encoder" - how to encode the following value into a payload
+            float -> org.apache.lucene.analysis.payloads.FloatEncoder,
+            integer -> o.a.l.a.p.IntegerEncoder
+            identity -> o.a.l.a.p.IdentityEncoder
+            Alternatively, a fully qualified class name implementing PayloadEncoder; the encoder must have a no-arg constructor.
+         -->
+        <filter class="solr.DelimitedPayloadTokenFilterFactory" encoder="float"/>
+      </analyzer>
+    </fieldtype>
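+
+    <!-- Sketch of the payload syntax described above, assuming a
+         hypothetical field declared as
+           <field name="weighted_terms" type="payloads"/>
+         An input value of
+           solr|2.0 lucene|0.5
+         is split on whitespace into two tokens, which would be indexed as
+         "solr" with a float payload of 2.0 and "lucene" with a float
+         payload of 0.5.
+     -->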
+
+    <!-- lowercases the entire field value, keeping it as a single token.  -->
+    <fieldType name="lowercase" class="solr.TextField" positionIncrementGap="100">
+      <analyzer>
+        <tokenizer class="solr.KeywordTokenizerFactory"/>
+        <filter class="solr.LowerCaseFilterFactory" />
+      </analyzer>
+    </fieldType>
+
+    <!--
+      Example of using PathHierarchyTokenizerFactory at index time, so
+      queries for paths match documents at that path, or in descendent paths
+    -->
+    <fieldType name="descendent_path" class="solr.TextField">
+      <analyzer type="index">
+	<tokenizer class="solr.PathHierarchyTokenizerFactory" delimiter="/" />
+      </analyzer>
+      <analyzer type="query">
+	<tokenizer class="solr.KeywordTokenizerFactory" />
+      </analyzer>
+    </fieldType>
+    <!--
+      Example of using PathHierarchyTokenizerFactory at query time, so
+      queries for paths match documents at that path, or in ancestor paths
+    -->
+    <fieldType name="ancestor_path" class="solr.TextField">
+      <analyzer type="index">
+	<tokenizer class="solr.KeywordTokenizerFactory" />
+      </analyzer>
+      <analyzer type="query">
+	<tokenizer class="solr.PathHierarchyTokenizerFactory" delimiter="/" />
+      </analyzer>
+    </fieldType>
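+
+    <!-- Illustration of the two path types above (the path /usr/local/lib
+         is a made-up example). PathHierarchyTokenizerFactory with
+         delimiter="/" emits one token per path prefix:
+
+           /usr/local/lib  =>  /usr, /usr/local, /usr/local/lib
+
+         descendent_path: a document indexed with /usr/local/lib is found by
+           the queries /usr, /usr/local and /usr/local/lib.
+         ancestor_path: a query for /usr/local/lib finds documents indexed
+           with /usr, /usr/local or /usr/local/lib.
+     -->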
+
+    <!-- since fields of this type are by default not stored or indexed,
+         any data added to them will be ignored outright.  -->
+    <fieldtype name="ignored" stored="false" indexed="false" multiValued="true" class="solr.StrField" />
+
+    <!-- This point type indexes the coordinates as separate fields (subFields).
+      If subFieldType is defined, it references a type, and a dynamic field
+      definition is created matching *___<typename>.  Alternatively, if
+      subFieldSuffix is defined, that is used to create the subFields.
+      Example: if subFieldType="double", then the coordinates would be
+        indexed in fields myloc_0___double,myloc_1___double.
+      Example: if subFieldSuffix="_d" then the coordinates would be indexed
+        in fields myloc_0_d,myloc_1_d
+      The subFields are an implementation detail of the fieldType, and end
+      users normally should not need to know about them.
+     -->
+    <fieldType name="point" class="solr.PointType" dimension="2" subFieldSuffix="_d"/>
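+
+    <!-- Sketch of how the "point" type might be used. The field name
+         "myloc" follows the comment above; the dynamicField line assumes
+         a "double" type is defined elsewhere in this schema. Because
+         subFieldSuffix="_d", a dynamicField matching *_d is needed:
+
+           <field name="myloc" type="point" indexed="true" stored="true"/>
+           <dynamicField name="*_d" type="double" indexed="true" stored="false"/>
+
+         A value is supplied as comma-separated coordinates, e.g. 45.17,93.87,
+         and would be indexed in myloc_0_d and myloc_1_d.
+     -->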
+
+    <!-- A specialized field for geospatial search. If indexed, this fieldType must not be multivalued. -->
+    <fieldType name="location" class="solr.LatLonType" subFieldSuffix="_coordinate"/>
+
+   <!--
+    A Geohash is a compact representation of a latitude longitude pair in a single field.
+    See http://wiki.apache.org/solr/SpatialSearch
+   -->
+    <fieldtype name="geohash" class="solr.GeoHashField"/>
+
+   <!-- Money/currency field type. See http://wiki.apache.org/solr/MoneyFieldType
+        Parameters:
+          defaultCurrency: Specifies the default currency if none specified. Defaults to "USD"
+          precisionStep:   Specifies the precisionStep for the TrieLong field used for the amount
+          providerClass:   Lets you plug in other exchange provider backend:
+                           solr.FileExchangeRateProvider is the default and takes one parameter:
+                             currencyConfig: name of an XML file holding exchange rates
+                           solr.OpenExchangeRatesOrgProvider uses rates from openexchangerates.org:
+                             ratesFileLocation: URL or path to rates JSON file (default latest.json on the web)
+                             refreshInterval: Number of minutes between each rates fetch (default: 1440, min: 60)
+   -->
+    <fieldType name="currency" class="solr.CurrencyField" precisionStep="8" defaultCurrency="USD" currencyConfig="currency.xml" />
+
+
+
+   <!-- some examples for different languages (generally ordered by ISO code) -->
+
+    <!-- Arabic -->
+    <fieldType name="text_ar" class="solr.TextField" positionIncrementGap="100">
+      <analyzer>
+        <tokenizer class="solr.StandardTokenizerFactory"/>
+        <!-- for any non-arabic -->
+        <filter class="solr.LowerCaseFilterFactory"/>
+        <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_ar.txt" enablePositionIncrements="true"/>
+        <!-- normalizes ﻯ to ﻱ, etc -->
+        <filter class="solr.ArabicNormalizationFilterFactory"/>
+        <filter class="solr.ArabicStemFilterFactory"/>
+      </analyzer>
+    </fieldType>
+
+    <!-- Bulgarian -->
+    <fieldType name="text_bg" class="solr.TextField" positionIncrementGap="100">
+      <analyzer>
+        <tokenizer class="solr.StandardTokenizerFactory"/>
+        <filter class="solr.LowerCaseFilterFactory"/>
+        <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_bg.txt" enablePositionIncrements="true"/>
+        <filter class="solr.BulgarianStemFilterFactory"/>
+      </analyzer>
+    </fieldType>
+
+    <!-- Catalan -->
+    <fieldType name="text_ca" class="solr.TextField" positionIncrementGap="100">
+      <analyzer>
+        <tokenizer class="solr.StandardTokenizerFactory"/>
+        <!-- removes l', etc -->
+        <filter class="solr.ElisionFilterFactory" ignoreCase="true" articles="lang/contractions_ca.txt"/>
+        <filter class="solr.LowerCaseFilterFactory"/>
+        <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_ca.txt" enablePositionIncrements="true"/>
+        <filter class="solr.SnowballPorterFilterFactory" language="Catalan"/>
+      </analyzer>
+    </fieldType>
+
+    <!-- CJK bigram (see text_ja for a Japanese configuration using morphological analysis) -->
+    <fieldType name="text_cjk" class="solr.TextField" positionIncrementGap="100">
+      <analyzer>
+        <tokenizer class="solr.StandardTokenizerFactory"/>
+        <!-- normalize width before bigram, as e.g. half-width dakuten combine  -->
+        <filter class="solr.CJKWidthFilterFactory"/>
+        <!-- for any non-CJK -->
+        <filter class="solr.LowerCaseFilterFactory"/>
+        <filter class="solr.CJKBigramFilterFactory"/>
+      </analyzer>
+    </fieldType>
+
+    <!-- Czech -->
+    <fieldType name="text_cz" class="solr.TextField" positionIncrementGap="100">
+      <analyzer>
+        <tokenizer class="solr.StandardTokenizerFactory"/>
+        <filter class="solr.LowerCaseFilterFactory"/>
+        <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_cz.txt" enablePositionIncrements="true"/>
+        <filter class="solr.CzechStemFilterFactory"/>
+      </analyzer>
+    </fieldType>
+
+    <!-- Danish -->
+    <fieldType name="text_da" class="solr.TextField" positionIncrementGap="100">
+      <analyzer>
+        <tokenizer class="solr.StandardTokenizerFactory"/>
+        <filter class="solr.LowerCaseFilterFactory"/>
+        <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_da.txt" format="snowball" enablePositionIncrements="true"/>
+        <filter class="solr.SnowballPorterFilterFactory" language="Danish"/>
+      </analyzer>
+    </fieldType>
+
+    <!-- German -->
+    <fieldType name="text_de" class="solr.TextField" positionIncrementGap="100">
+      <analyzer>
+        <tokenizer class="solr.StandardTokenizerFactory"/>
+        <filter class="solr.LowerCaseFilterFactory"/>
+        <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_de.txt" format="snowball" enablePositionIncrements="true"/>
+        <filter class="solr.GermanNormalizationFilterFactory"/>
+        <filter class="solr.GermanLightStemFilterFactory"/>
+        <!-- less aggressive: <filter class="solr.GermanMinimalStemFilterFactory"/> -->
+        <!-- more aggressive: <filter class="solr.SnowballPorterFilterFactory" language="German2"/> -->
+      </analyzer>
+    </fieldType>
+
+    <!-- Greek -->
+    <fieldType name="text_el" class="solr.TextField" positionIncrementGap="100">
+      <analyzer>
+        <tokenizer class="solr.StandardTokenizerFactory"/>
+        <!-- greek specific lowercase for sigma -->
+        <filter class="solr.GreekLowerCaseFilterFactory"/>
+        <filter class="solr.StopFilterFactory" ignoreCase="false" words="lang/stopwords_el.txt" enablePositionIncrements="true"/>
+        <filter class="solr.GreekStemFilterFactory"/>
+      </analyzer>
+    </fieldType>
+
+    <!-- Spanish -->
+    <fieldType name="text_es" class="solr.TextField" positionIncrementGap="100">
+      <analyzer>
+        <tokenizer class="solr.StandardTokenizerFactory"/>
+        <filter class="solr.LowerCaseFilterFactory"/>
+        <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_es.txt" format="snowball" enablePositionIncrements="true"/>
+        <filter class="solr.SpanishLightStemFilterFactory"/>
+        <!-- more aggressive: <filter class="solr.SnowballPorterFilterFactory" language="Spanish"/> -->
+      </analyzer>
+    </fieldType>
+
+    <!-- Basque -->
+    <fieldType name="text_eu" class="solr.TextField" positionIncrementGap="100">
+      <analyzer>
+        <tokenizer class="solr.StandardTokenizerFactory"/>
+        <filter class="solr.LowerCaseFilterFactory"/>
+        <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_eu.txt" enablePositionIncrements="true"/>
+        <filter class="solr.SnowballPorterFilterFactory" language="Basque"/>
+      </analyzer>
+    </fieldType>
+
+    <!-- Persian -->
+    <fieldType name="text_fa" class="solr.TextField" positionIncrementGap="100">
+      <analyzer>
+        <!-- for ZWNJ -->
+        <charFilter class="solr.PersianCharFilterFactory"/>
+        <tokenizer class="solr.StandardTokenizerFactory"/>
+        <filter class="solr.LowerCaseFilterFactory"/>
+        <filter class="solr.ArabicNormalizationFilterFactory"/>
+        <filter class="solr.PersianNormalizationFilterFactory"/>
+        <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_fa.txt" enablePositionIncrements="true"/>
+      </analyzer>
+    </fieldType>
+
+    <!-- Finnish -->
+    <fieldType name="text_fi" class="solr.TextField" positionIncrementGap="100">
+      <analyzer>
+        <tokenizer class="solr.StandardTokenizerFactory"/>
+        <filter class="solr.LowerCaseFilterFactory"/>
+        <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_fi.txt" format="snowball" enablePositionIncrements="true"/>
+        <filter class="solr.SnowballPorterFilterFactory" language="Finnish"/>
+        <!-- less aggressive: <filter class="solr.FinnishLightStemFilterFactory"/> -->
+      </analyzer>
+    </fieldType>
+
+    <!-- French -->
+    <fieldType name="text_fr" class="solr.TextField" positionIncrementGap="100">
+      <analyzer>
+        <tokenizer class="solr.StandardTokenizerFactory"/>
+        <!-- removes l', etc -->
+        <filter class="solr.ElisionFilterFactory" ignoreCase="true" articles="lang/contractions_fr.txt"/>
+        <filter class="solr.LowerCaseFilterFactory"/>
+        <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_fr.txt" format="snowball" enablePositionIncrements="true"/>
+        <filter class="solr.FrenchLightStemFilterFactory"/>
+        <!-- less aggressive: <filter class="solr.FrenchMinimalStemFilterFactory"/> -->
+        <!-- more aggressive: <filter class="solr.SnowballPorterFilterFactory" language="French"/> -->
+      </analyzer>
+    </fieldType>
+
+    <!-- Irish -->
+    <fieldType name="text_ga" class="solr.TextField" positionIncrementGap="100">
+      <analyzer>
+        <tokenizer class="solr.StandardTokenizerFactory"/>
+        <!-- removes d', etc -->
+        <filter class="solr.ElisionFilterFactory" ignoreCase="true" articles="lang/contractions_ga.txt"/>
+        <!-- removes n-, etc. position increments is intentionally false! -->
+        <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/hyphenations_ga.txt" enablePositionIncrements="false"/>
+        <filter class="solr.IrishLowerCaseFilterFactory"/>
+        <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_ga.txt" enablePositionIncrements="true"/>
+        <filter class="solr.SnowballPorterFilterFactory" language="Irish"/>
+      </analyzer>
+    </fieldType>
+
+    <!-- Galician -->
+    <fieldType name="text_gl" class="solr.TextField" positionIncrementGap="100">
+      <analyzer>
+        <tokenizer class="solr.StandardTokenizerFactory"/>
+        <filter class="solr.LowerCaseFilterFactory"/>
+        <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_gl.txt" enablePositionIncrements="true"/>
+        <filter class="solr.GalicianStemFilterFactory"/>
+        <!-- less aggressive: <filter class="solr.GalicianMinimalStemFilterFactory"/> -->
+      </analyzer>
+    </fieldType>
+
+    <!-- Hindi -->
+    <fieldType name="text_hi" class="solr.TextField" positionIncrementGap="100">
+      <analyzer>
+        <tokenizer class="solr.StandardTokenizerFactory"/>
+        <filter class="solr.LowerCaseFilterFactory"/>
+        <!-- normalizes unicode representation -->
+        <filter class="solr.IndicNormalizationFilterFactory"/>
+        <!-- normalizes variation in spelling -->
+        <filter class="solr.HindiNormalizationFilterFactory"/>
+        <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_hi.txt" enablePositionIncrements="true"/>
+        <filter class="solr.HindiStemFilterFactory"/>
+      </analyzer>
+    </fieldType>
+
+    <!-- Hungarian -->
+    <fieldType name="text_hu" class="solr.TextField" positionIncrementGap="100">
+      <analyzer>
+        <tokenizer class="solr.StandardTokenizerFactory"/>
+        <filter class="solr.LowerCaseFilterFactory"/>
+        <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_hu.txt" format="snowball" enablePositionIncrements="true"/>
+        <filter class="solr.SnowballPorterFilterFactory" language="Hungarian"/>
+        <!-- less aggressive: <filter class="solr.HungarianLightStemFilterFactory"/> -->
+      </analyzer>
+    </fieldType>
+
+    <!-- Armenian -->
+    <fieldType name="text_hy" class="solr.TextField" positionIncrementGap="100">
+      <analyzer>
+        <tokenizer class="solr.StandardTokenizerFactory"/>
+        <filter class="solr.LowerCaseFilterFactory"/>
+        <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_hy.txt" enablePositionIncrements="true"/>
+        <filter class="solr.SnowballPorterFilterFactory" language="Armenian"/>
+      </analyzer>
+    </fieldType>
+
+    <!-- Indonesian -->
+    <fieldType name="text_id" class="solr.TextField" positionIncrementGap="100">
+      <analyzer>
+        <tokenizer class="solr.StandardTokenizerFactory"/>
+        <filter class="solr.LowerCaseFilterFactory"/>
+        <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_id.txt" enablePositionIncrements="true"/>
+        <!-- for a less aggressive approach (only inflectional suffixes), set stemDerivational to false -->
+        <filter class="solr.IndonesianStemFilterFactory" stemDerivational="true"/>
+      </analyzer>
+    </fieldType>
+
+    <!-- Italian -->
+    <fieldType name="text_it" class="solr.TextField" positionIncrementGap="100">
+      <analyzer>
+        <tokenizer class="solr.StandardTokenizerFactory"/>
+        <!-- removes l', etc -->
+        <filter class="solr.ElisionFilterFactory" ignoreCase="true" articles="lang/contractions_it.txt"/>
+        <filter class="solr.LowerCaseFilterFactory"/>
+        <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_it.txt" format="snowball" enablePositionIncrements="true"/>
+        <filter class="solr.ItalianLightStemFilterFactory"/>
+        <!-- more aggressive: <filter class="solr.SnowballPorterFilterFactory" language="Italian"/> -->
+      </analyzer>
+    </fieldType>
+
+    <!-- Latvian -->
+    <fieldType name="text_lv" class="solr.TextField" positionIncrementGap="100">
+      <analyzer>
+        <tokenizer class="solr.StandardTokenizerFactory"/>
+        <filter class="solr.LowerCaseFilterFactory"/>
+        <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_lv.txt" enablePositionIncrements="true"/>
+        <filter class="solr.LatvianStemFilterFactory"/>
+      </analyzer>
+    </fieldType>
+
+    <!-- Dutch -->
+    <fieldType name="text_nl" class="solr.TextField" positionIncrementGap="100">
+      <analyzer>
+        <tokenizer class="solr.StandardTokenizerFactory"/>
+        <filter class="solr.LowerCaseFilterFactory"/>
+        <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_nl.txt" format="snowball" enablePositionIncrements="true"/>
+        <filter class="solr.StemmerOverrideFilterFactory" dictionary="lang/stemdict_nl.txt" ignoreCase="false"/>
+        <filter class="solr.SnowballPorterFilterFactory" language="Dutch"/>
+      </analyzer>
+    </fieldType>
+
+    <!-- Norwegian -->
+    <fieldType name="text_no" class="solr.TextField" positionIncrementGap="100">
+      <analyzer>
+        <tokenizer class="solr.StandardTokenizerFactory"/>
+        <filter class="solr.LowerCaseFilterFactory"/>
+        <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_no.txt" format="snowball" enablePositionIncrements="true"/>
+        <filter class="solr.SnowballPorterFilterFactory" language="Norwegian"/>
+        <!-- less aggressive: <filter class="solr.NorwegianLightStemFilterFactory"/> -->
+        <!-- singular/plural: <filter class="solr.NorwegianMinimalStemFilterFactory"/> -->
+      </analyzer>
+    </fieldType>
+
+    <!-- Portuguese -->
+    <fieldType name="text_pt" class="solr.TextField" positionIncrementGap="100">
+      <analyzer>
+        <tokenizer class="solr.StandardTokenizerFactory"/>
+        <filter class="solr.LowerCaseFilterFactory"/>
+        <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_pt.txt" format="snowball" enablePositionIncrements="true"/>
+        <filter class="solr.PortugueseLightStemFilterFactory"/>
+        <!-- less aggressive: <filter class="solr.PortugueseMinimalStemFilterFactory"/> -->
+        <!-- more aggressive: <filter class="solr.SnowballPorterFilterFactory" language="Portuguese"/> -->
+        <!-- most aggressive: <filter class="solr.PortugueseStemFilterFactory"/> -->
+      </analyzer>
+    </fieldType>
+
+    <!-- Romanian -->
+    <fieldType name="text_ro" class="solr.TextField" positionIncrementGap="100">
+      <analyzer>
+        <tokenizer class="solr.StandardTokenizerFactory"/>
+        <filter class="solr.LowerCaseFilterFactory"/>
+        <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_ro.txt" enablePositionIncrements="true"/>
+        <filter class="solr.SnowballPorterFilterFactory" language="Romanian"/>
+      </analyzer>
+    </fieldType>
+
+    <!-- Russian -->
+    <fieldType name="text_ru" class="solr.TextField" positionIncrementGap="100">
+      <analyzer>
+        <tokenizer class="solr.StandardTokenizerFactory"/>
+        <filter class="solr.LowerCaseFilterFactory"/>
+        <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_ru.txt" format="snowball" enablePositionIncrements="true"/>
+        <filter class="solr.SnowballPorterFilterFactory" language="Russian"/>
+        <!-- less aggressive: <filter class="solr.RussianLightStemFilterFactory"/> -->
+      </analyzer>
+    </fieldType>
+
+    <!-- Swedish -->
+    <fieldType name="text_sv" class="solr.TextField" positionIncrementGap="100">
+      <analyzer>
+        <tokenizer class="solr.StandardTokenizerFactory"/>
+        <filter class="solr.LowerCaseFilterFactory"/>
+        <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_sv.txt" format="snowball" enablePositionIncrements="true"/>
+        <filter class="solr.SnowballPorterFilterFactory" language="Swedish"/>
+        <!-- less aggressive: <filter class="solr.SwedishLightStemFilterFactory"/> -->
+      </analyzer>
+    </fieldType>
+
+    <!-- Thai -->
+    <fieldType name="text_th" class="solr.TextField" positionIncrementGap="100">
+      <analyzer>
+        <tokenizer class="solr.StandardTokenizerFactory"/>
+        <filter class="solr.LowerCaseFilterFactory"/>
+        <filter class="solr.ThaiWordFilterFactory"/>
+        <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_th.txt" enablePositionIncrements="true"/>
+      </analyzer>
+    </fieldType>
+
+    <!-- Turkish -->
+    <fieldType name="text_tr" class="solr.TextField" positionIncrementGap="100">
+      <analyzer>
+        <tokenizer class="solr.StandardTokenizerFactory"/>
+        <filter class="solr.TurkishLowerCaseFilterFactory"/>
+        <filter class="solr.StopFilterFactory" ignoreCase="false" words="lang/stopwords_tr.txt" enablePositionIncrements="true"/>
+        <filter class="solr.SnowballPorterFilterFactory" language="Turkish"/>
+      </analyzer>
+    </fieldType>
+
+ </types>
+
+  <!-- Similarity is the scoring routine for each document vs. a query.
+       A custom Similarity or SimilarityFactory may be specified here, but
+       the default is fine for most applications.
+       For more info: http://wiki.apache.org/solr/SchemaXml#Similarity
+    -->
+  <!--
+     <similarity class="com.example.solr.CustomSimilarityFactory">
+       <str name="paramkey">param value</str>
+     </similarity>
+    -->
+
+</schema>
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/solrconfig.xml b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/solrconfig.xml
new file mode 100644
index 0000000..dc1cfc5
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/solrconfig.xml
@@ -0,0 +1,1828 @@
+<?xml version="1.0" encoding="UTF-8" ?>
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements.  See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License.  You may obtain a copy of the License at
+
+     http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+
+<!-- 
+     For more details about configurations options that may appear in
+     this file, see http://wiki.apache.org/solr/SolrConfigXml. 
+-->
+<config>
+  <!-- In all configuration below, a prefix of "solr." for class names
+       is an alias that causes solr to search appropriate packages,
+       including org.apache.solr.(search|update|request|core|analysis)
+
+       You may also specify a fully qualified Java classname if you
+       have your own custom plugins.
+    -->
+
+  <!-- Controls what version of Lucene various components of Solr
+       adhere to.  Generally, you want to use the latest version to
+       get all bug fixes and improvements. It is highly recommended
+       that you fully re-index after changing this setting as it can
+       affect both how text is indexed and queried.
+  -->
+  <luceneMatchVersion>LUCENE_43</luceneMatchVersion>
+
+  <!-- lib directives can be used to instruct Solr to load any Jars
+       identified and use them to resolve any "plugins" specified in
+       your solrconfig.xml or schema.xml (ie: Analyzers, Request
+       Handlers, etc...).
+
+       All directories and paths are resolved relative to the
+       instanceDir.
+
+       If a "./lib" directory exists in your instanceDir, all files
+       found in it are included as if you had used the following
+       syntax...
+       
+              <lib dir="./lib" />
+    -->
+
+  <!-- A 'dir' option by itself adds any files found in the directory 
+       to the classpath; this is useful for including all jars in a
+       directory.
+    -->
+  <!--
+     <lib dir="../add-everything-found-in-this-dir-to-the-classpath" />
+  -->
+  
+  <!-- When a 'regex' is specified in addition to a 'dir', only the
+       files in that directory which completely match the regex
+       (anchored on both ends) will be included.
+    -->
+  <lib dir="../../../contrib/extraction/lib" regex=".*\.jar" />
+  <lib dir="../../../dist/" regex="solr-cell-\d.*\.jar" />
+
+  <lib dir="../../../contrib/clustering/lib/" regex=".*\.jar" />
+  <lib dir="../../../dist/" regex="solr-clustering-\d.*\.jar" />
+
+  <lib dir="../../../contrib/langid/lib/" regex=".*\.jar" />
+  <lib dir="../../../dist/" regex="solr-langid-\d.*\.jar" />
+
+  <lib dir="../../../contrib/velocity/lib" regex=".*\.jar" />
+  <lib dir="../../../dist/" regex="solr-velocity-\d.*\.jar" />
+
+  <!-- If a 'dir' option (with or without a regex) is used and nothing
+       is found that matches, it will be ignored
+    -->
+  <lib dir="/non/existent/dir/yields/warning" /> 
+
+  <!-- an exact 'path' can be used instead of a 'dir' to specify a 
+       specific file.  This will cause a serious error to be logged if 
+       it can't be loaded.
+    -->
+  <!--
+     <lib path="../a-jar-that-does-not-exist.jar" /> 
+  -->
+  
+  <!-- Data Directory
+
+       Used to specify an alternate directory to hold all index data
+       other than the default ./data under the Solr home.  If
+       replication is in use, this should match the replication
+       configuration.
+    -->
+  <!--<dataDir>/data/3/collection1/data</dataDir>-->
+  <dataDir>${solr.data.dir:}</dataDir>
+ 
+
+  <!-- The DirectoryFactory to use for indexes.
+       
+       solr.StandardDirectoryFactory is filesystem
+       based and tries to pick the best implementation for the current
+       JVM and platform.  solr.NRTCachingDirectoryFactory, the default,
+       wraps solr.StandardDirectoryFactory and caches small files in memory
+       for better NRT performance.
+
+       One can force a particular implementation via solr.MMapDirectoryFactory,
+       solr.NIOFSDirectoryFactory, or solr.SimpleFSDirectoryFactory.
+
+       solr.RAMDirectoryFactory is memory based, not
+       persistent, and doesn't work with replication.
+    -->
+  <directoryFactory name="DirectoryFactory" 
+                    class="${solr.directoryFactory:solr.NRTCachingDirectoryFactory}"/> 
+
+  <!-- The CodecFactory for defining the format of the inverted index.
+       The default implementation is SchemaCodecFactory, which is the official Lucene
+       index format, but hooks into the schema to provide per-field customization of
+       the postings lists and per-document values in the fieldType element
+       (postingsFormat/docValuesFormat). Note that most of the alternative implementations
+       are experimental, so if you choose to customize the index format, it's a good
+       idea to convert back to the official format e.g. via IndexWriter.addIndexes(IndexReader)
+       before upgrading to a newer version to avoid unnecessary reindexing.
+  -->
+  <codecFactory class="solr.SchemaCodecFactory"/>
+
+  <!-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+       Index Config - These settings control low-level behavior of indexing
+       Most example settings here show the default value, but are commented
+       out, to more easily see where customizations have been made.
+       
+       Note: This replaces <indexDefaults> and <mainIndex> from older versions
+       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -->
+  <indexConfig>
+    <!-- maxFieldLength was removed in 4.0. To get similar behavior, include a 
+         LimitTokenCountFilterFactory in your fieldType definition. E.g. 
+     <filter class="solr.LimitTokenCountFilterFactory" maxTokenCount="10000"/>
+    -->
+    <!-- Maximum time to wait for a write lock (ms) for an IndexWriter. Default: 1000 -->
+    <!-- <writeLockTimeout>1000</writeLockTimeout>  -->
+
+    <!-- The maximum number of simultaneous threads that may be
+         indexing documents at once in IndexWriter; if more than this
+         many threads arrive they will wait for others to finish.
+         Default in Solr/Lucene is 8. -->
+         <maxIndexingThreads>${solr.maxIndexingThreads:8}</maxIndexingThreads>
+    
+    <!-- Expert: Enabling compound file will use fewer files for the index,
+         and so fewer file descriptors, at the expense of performance.
+         Default in Lucene is "true". Default in Solr is "false" (since 3.6) -->
+    <!-- <useCompoundFile>false</useCompoundFile> -->
+
+    <!-- ramBufferSizeMB sets the amount of RAM that may be used by Lucene
+         indexing for buffering added documents and deletions before they are
+         flushed to the Directory.
+         maxBufferedDocs sets a limit on the number of documents buffered
+         before flushing.
+         If both ramBufferSizeMB and maxBufferedDocs are set, then
+         Lucene will flush based on whichever limit is hit first.  -->
+    <ramBufferSizeMB>128</ramBufferSizeMB>
+    <!-- <maxBufferedDocs>1000</maxBufferedDocs> -->
+
+    <!-- Expert: Merge Policy 
+         The Merge Policy in Lucene controls how merging of segments is done.
+         The default since Solr/Lucene 3.3 is TieredMergePolicy.
+         The default from Lucene 2.3 onward was LogByteSizeMergePolicy;
+         even older versions of Lucene used LogDocMergePolicy.
+      -->
+    <!--
+        <mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
+          <int name="maxMergeAtOnce">10</int>
+          <int name="segmentsPerTier">10</int>
+        </mergePolicy>
+      -->
+       
+    <!-- Merge Factor
+         The merge factor controls how many segments will get merged at a time.
+         For TieredMergePolicy, mergeFactor is a convenience parameter which
+         will set both MaxMergeAtOnce and SegmentsPerTier at once.
+         For LogByteSizeMergePolicy, mergeFactor decides how many new segments
+         will be allowed before they are merged into one.
+         Default is 10 for both merge policies.
+      -->
+    <!-- 
+    <mergeFactor>10</mergeFactor>
+      -->
+
+    <!-- Expert: Merge Scheduler
+         The Merge Scheduler in Lucene controls how merges are
+         performed.  The ConcurrentMergeScheduler (Lucene 2.3 default)
+         can perform merges in the background using separate threads.
+         The SerialMergeScheduler (Lucene 2.2 default) does not.
+     -->
+    <!-- 
+       <mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler"/>
+       -->
+
+    <!-- LockFactory 
+
+         This option specifies which Lucene LockFactory implementation
+         to use.
+      
+         single = SingleInstanceLockFactory - suggested for a
+                  read-only index or when there is no possibility of
+                  another process trying to modify the index.
+         native = NativeFSLockFactory - uses OS native file locking.
+                  Do not use when multiple solr webapps in the same
+                  JVM are attempting to share a single index.
+         simple = SimpleFSLockFactory  - uses a plain file for locking
+
+         Defaults: 'native' is default for Solr 3.6 and later, otherwise
+                   'simple' is the default
+
+         More details on the nuances of each LockFactory...
+         http://wiki.apache.org/lucene-java/AvailableLockFactories
+    -->
+    <!-- <lockType>native</lockType> -->
+
+    <!-- Unlock On Startup
+
+         If true, unlock any held write or commit locks on startup.
+         This defeats the locking mechanism that allows multiple
+         processes to safely access a lucene index, and should be used
+         with care. Default is "false".
+
+         This is not needed if lock type is 'none' or 'single'
+     -->
+    <!--
+    <unlockOnStartup>false</unlockOnStartup>
+      -->
+    
+    <!-- Expert: Controls how often Lucene loads terms into memory
+         Default is 128 and is likely good for most everyone.
+      -->
+    <!-- <termIndexInterval>128</termIndexInterval> -->
+
+    <!-- If true, IndexReaders will be reopened (often more efficient)
+         instead of closed and then opened. Default: true
+      -->
+    <!-- 
+    <reopenReaders>true</reopenReaders>
+      -->
+
+    <!-- Commit Deletion Policy
+
+         Custom deletion policies can be specified here. The class must
+         implement org.apache.lucene.index.IndexDeletionPolicy.
+
+         http://lucene.apache.org/java/3_5_0/api/core/org/apache/lucene/index/IndexDeletionPolicy.html
+
+         The default Solr IndexDeletionPolicy implementation supports
+         deleting index commit points on number of commits, age of
+         commit point and optimized status.
+         
+         The latest commit point should always be preserved regardless
+         of the criteria.
+    -->
+    <!-- 
+    <deletionPolicy class="solr.SolrDeletionPolicy">
+    -->
+      <!-- The number of commit points to be kept -->
+      <!-- <str name="maxCommitsToKeep">1</str> -->
+      <!-- The number of optimized commit points to be kept -->
+      <!-- <str name="maxOptimizedCommitsToKeep">0</str> -->
+      <!--
+          Delete all commit points once they have reached the given age.
+          Supports DateMathParser syntax e.g.
+        -->
+      <!--
+         <str name="maxCommitAge">30MINUTES</str>
+         <str name="maxCommitAge">1DAY</str>
+      -->
+    <!-- 
+    </deletionPolicy>
+    -->
+
+    <!-- Lucene Infostream
+       
+         To aid in advanced debugging, Lucene provides an "InfoStream"
+         of detailed information when indexing.
+
+         Setting the value to true will instruct the underlying Lucene
+         IndexWriter to write its debugging info to the specified file.
+      -->
+     <!-- <infoStream file="INFOSTREAM.txt">false</infoStream> --> 
+  </indexConfig>
+
+
+  <!-- JMX
+       
+       This example enables JMX if and only if an existing MBeanServer
+       is found; use this if you want to configure JMX through JVM
+       parameters. Remove this to disable exposing Solr configuration
+       and statistics to JMX.
+
+       For more details see http://wiki.apache.org/solr/SolrJmx
+    -->
+  <jmx />
+  <!-- If you want to connect to a particular server, specify the
+       agentId 
+    -->
+  <!-- <jmx agentId="myAgent" /> -->
+  <!-- If you want to start a new MBeanServer, specify the serviceUrl -->
+  <!-- <jmx serviceUrl="service:jmx:rmi:///jndi/rmi://localhost:9999/solr"/>
+    -->
+
+  <!-- The default high-performance update handler -->
+  <updateHandler class="solr.DirectUpdateHandler2">
+
+    <!-- Enables a transaction log, used for real-time get, durability, and
+         SolrCloud replica recovery.  The log can grow as big as
+         uncommitted changes to the index, so use of a hard autoCommit
+         is recommended (see below).
+         "dir" - the target directory for transaction logs, defaults to the
+                solr data directory.  --> 
+    <updateLog>
+      <str name="dir">${solr.ulog.dir:}</str>
+    </updateLog>
+ 
+    <!-- AutoCommit
+
+         Perform a hard commit automatically under certain conditions.
+         Instead of enabling autoCommit, consider using "commitWithin"
+         when adding documents. 
+
+         http://wiki.apache.org/solr/UpdateXmlMessages
+
+         maxDocs - Maximum number of documents to add since the last
+                   commit before automatically triggering a new commit.
+
+         maxTime - Maximum amount of time in ms that is allowed to pass
+                   since a document was added before automatically
+                   triggering a new commit. 
+         openSearcher - if false, the commit causes recent index changes
+         to be flushed to stable storage, but does not cause a new
+         searcher to be opened to make those changes visible.
+      -->
+     <autoCommit> 
+       <maxTime>${solr.autoCommit.maxTime:60000}</maxTime> 
+       <openSearcher>false</openSearcher> 
+     </autoCommit>
+
+    <!-- softAutoCommit is like autoCommit except it causes a
+         'soft' commit which only ensures that changes are visible
+         but does not ensure that data is synced to disk.  This is
+         faster and more near-realtime friendly than a hard commit.
+      -->
+     
+       <autoSoftCommit> 
+         <maxTime>${solr.autoSoftCommit.maxTime:1000}</maxTime> 
+       </autoSoftCommit>
+      
+
+    <!-- Update Related Event Listeners
+         
+         Various IndexWriter related events can trigger Listeners to
+         take actions.
+
+         postCommit - fired after every commit or optimize command
+         postOptimize - fired after every optimize command
+      -->
+    <!-- The RunExecutableListener executes an external command from a
+         hook such as postCommit or postOptimize.
+         
+         exe - the name of the executable to run
+         dir - dir to use as the current working directory. (default=".")
+         wait - the calling thread waits until the executable returns. 
+                (default="true")
+         args - the arguments to pass to the program.  (default is none)
+         env - environment variables to set.  (default is none)
+      -->
+    <!-- This example shows how RunExecutableListener could be used
+         with the script based replication...
+         http://wiki.apache.org/solr/CollectionDistribution
+      -->
+    <!--
+       <listener event="postCommit" class="solr.RunExecutableListener">
+         <str name="exe">solr/bin/snapshooter</str>
+         <str name="dir">.</str>
+         <bool name="wait">true</bool>
+         <arr name="args"> <str>arg1</str> <str>arg2</str> </arr>
+         <arr name="env"> <str>MYVAR=val1</str> </arr>
+       </listener>
+      -->
+
+
+  </updateHandler>
+  
+  <!-- IndexReaderFactory
+
+       Use the following format to specify a custom IndexReaderFactory,
+       which allows for alternate IndexReader implementations.
+
+       ** Experimental Feature **
+
+       Please note - Using a custom IndexReaderFactory may prevent
+       certain other features from working. The API to
+       IndexReaderFactory may change without warning or may even be
+       removed from future releases if the problems cannot be
+       resolved.
+
+
+       ** Features that may not work with custom IndexReaderFactory **
+
+       The ReplicationHandler assumes a disk-resident index. Using a
+       custom IndexReader implementation may cause incompatibility
+       with ReplicationHandler and may cause replication to not work
+       correctly. See SOLR-1366 for details.
+
+    -->
+  <!--
+  <indexReaderFactory name="IndexReaderFactory" class="package.class">
+    <str name="someArg">Some Value</str>
+  </indexReaderFactory >
+  -->
+  <!-- By explicitly declaring the Factory, the termIndexDivisor can
+       be specified.
+    -->
+  <!--
+     <indexReaderFactory name="IndexReaderFactory" 
+                         class="solr.StandardIndexReaderFactory">
+       <int name="setTermIndexDivisor">12</int>
+     </indexReaderFactory >
+    -->
+
+  <!-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+       Query section - these settings control query time things like caches
+       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -->
+  <query>
+    <!-- Max Boolean Clauses
+
+         Maximum number of clauses in each BooleanQuery; an exception
+         is thrown if exceeded.
+
+         ** WARNING **
+         
+         This option actually modifies a global Lucene property that
+         will affect all SolrCores.  If multiple solrconfig.xml files
+         disagree on this property, the value at any given moment will
+         be based on the last SolrCore to be initialized.
+         
+      -->
+    <maxBooleanClauses>1024</maxBooleanClauses>
+
+
+    <!-- Solr Internal Query Caches
+
+         There are two implementations of cache available for Solr,
+         LRUCache, based on a synchronized LinkedHashMap, and
+         FastLRUCache, based on a ConcurrentHashMap.  
+
+         FastLRUCache has faster gets and slower puts in single
+         threaded operation and thus is generally faster than LRUCache
+         when the hit ratio of the cache is high (> 75%), and may be
+         faster under other scenarios on multi-cpu systems.
+    -->
+
+    <!-- Filter Cache
+
+         Cache used by SolrIndexSearcher for filters (DocSets),
+         unordered sets of *all* documents that match a query.  When a
+         new searcher is opened, its caches may be prepopulated or
+         "autowarmed" using data from caches in the old searcher.
+         autowarmCount is the number of items to prepopulate.  For
+         LRUCache, the autowarmed items will be the most recently
+         accessed items.
+
+         Parameters:
+           class - the SolrCache implementation to use
+               (LRUCache or FastLRUCache)
+           size - the maximum number of entries in the cache
+           initialSize - the initial capacity (number of entries) of
+               the cache.  (see java.util.HashMap)
+           autowarmCount - the number of entries to prepopulate from
+               an old cache.
+      -->
+    <filterCache class="solr.FastLRUCache"
+                 size="512"
+                 initialSize="512"
+                 autowarmCount="0"/>
+
+    <!-- Query Result Cache
+         
+         Caches results of searches - ordered lists of document ids
+         (DocList) based on a query, a sort, and the range of documents requested.  
+      -->
+    <queryResultCache class="solr.LRUCache"
+                     size="512"
+                     initialSize="512"
+                     autowarmCount="0"/>
+   
+    <!-- Document Cache
+
+         Caches Lucene Document objects (the stored fields for each
+         document).  Since Lucene internal document ids are transient,
+         this cache will not be autowarmed.  
+      -->
+    <documentCache class="solr.LRUCache"
+                   size="512"
+                   initialSize="512"
+                   autowarmCount="0"/>
+    
+    <!-- Field Value Cache
+         
+         Cache used to hold field values that are quickly accessible
+         by document id.  The fieldValueCache is created by default
+         even if not configured here.
+      -->
+    <!--
+       <fieldValueCache class="solr.FastLRUCache"
+                        size="512"
+                        autowarmCount="128"
+                        showItems="32" />
+      -->
+
+    <!-- Custom Cache
+
+         Example of a generic cache.  These caches may be accessed by
+         name through SolrIndexSearcher.getCache(), cacheLookup(), and
+         cacheInsert().  The purpose is to enable easy caching of
+         user/application level data.  The regenerator argument should
+         be specified as an implementation of solr.CacheRegenerator 
+         if autowarming is desired.  
+      -->
+    <!--
+       <cache name="myUserCache"
+              class="solr.LRUCache"
+              size="4096"
+              initialSize="1024"
+              autowarmCount="1024"
+              regenerator="com.mycompany.MyRegenerator"
+              />
+      -->
+
+
+    <!-- Lazy Field Loading
+
+         If true, stored fields that are not requested will be loaded
+         lazily.  This can result in a significant speed improvement
+         if the usual case is to not load all stored fields,
+         especially if the skipped fields are large compressed text
+         fields.
+    -->
+    <enableLazyFieldLoading>true</enableLazyFieldLoading>
+
+   <!-- Use Filter For Sorted Query
+
+        A possible optimization that attempts to use a filter to
+        satisfy a search.  If the requested sort does not include
+        score, then the filterCache will be checked for a filter
+        matching the query. If found, the filter will be used as the
+        source of document ids, and then the sort will be applied to
+        that.
+
+        For most situations, this will not be useful unless you
+        frequently get the same search repeatedly with different sort
+        options, and none of them ever use "score"
+     -->
+   <!--
+      <useFilterForSortedQuery>true</useFilterForSortedQuery>
+     -->
+
+   <!-- Result Window Size
+
+        An optimization for use with the queryResultCache.  When a search
+        is requested, a superset of the requested number of document ids
+        is collected.  For example, if a search for a particular query
+        requests matching documents 10 through 19, and queryResultWindowSize is 50,
+        then documents 0 through 49 will be collected and cached.  Any further
+        requests in that range can be satisfied via the cache.  
+     -->
+   <queryResultWindowSize>20</queryResultWindowSize>
+
+   <!-- Maximum number of documents to cache for any entry in the
+        queryResultCache. 
+     -->
+   <queryResultMaxDocsCached>200</queryResultMaxDocsCached>
+
+   <!-- Query Related Event Listeners
+
+        Various IndexSearcher related events can trigger Listeners to
+        take actions.
+
+        newSearcher - fired whenever a new searcher is being prepared
+        and there is a current searcher handling requests (aka
+        registered).  It can be used to prime certain caches to
+        prevent long request times for certain requests.
+
+        firstSearcher - fired whenever a new searcher is being
+        prepared but there is no current registered searcher to handle
+        requests or to gain autowarming data from.
+
+        
+     -->
+    <!-- QuerySenderListener takes an array of NamedList and executes a
+         local query request for each NamedList in sequence. 
+      -->
+    <listener event="newSearcher" class="solr.QuerySenderListener">
+      <arr name="queries">
+        <!--
+           <lst><str name="q">solr</str><str name="sort">price asc</str></lst>
+           <lst><str name="q">rocks</str><str name="sort">weight asc</str></lst>
+          -->
+      </arr>
+    </listener>
+    <listener event="firstSearcher" class="solr.QuerySenderListener">
+      <arr name="queries">
+        <lst>
+          <str name="q">static firstSearcher warming in solrconfig.xml</str>
+        </lst>
+      </arr>
+    </listener>
+
+    <!-- Use Cold Searcher
+
+         If a search request comes in and there is no current
+         registered searcher, then immediately register the still
+         warming searcher and use it.  If "false" then all requests
+         will block until the first searcher is done warming.
+      -->
+    <useColdSearcher>false</useColdSearcher>
+
+    <!-- Max Warming Searchers
+         
+         Maximum number of searchers that may be warming in the
+         background concurrently.  An error is returned if this limit
+         is exceeded.
+
+         Recommended values are 1-2 for read-only slaves, higher for
+         masters w/o cache warming.
+      -->
+    <maxWarmingSearchers>4</maxWarmingSearchers>
+
+  </query>
+
+
+  <!-- Request Dispatcher
+
+       This section contains instructions for how the SolrDispatchFilter
+       should behave when processing requests for this SolrCore.
+
+       handleSelect is a legacy option that affects the behavior of requests
+       such as /select?qt=XXX
+
+       handleSelect="true" will cause the SolrDispatchFilter to process
+       the request and dispatch the query to a handler specified by the
+       "qt" param, assuming "/select" isn't already registered.
+
+       handleSelect="false" will cause the SolrDispatchFilter to
+       ignore "/select" requests, resulting in a 404 unless a handler
+       is explicitly registered with the name "/select"
+
+       handleSelect="true" is not recommended for new users, but is the default
+       for backwards compatibility
+    -->
+  <requestDispatcher handleSelect="false" >
+    <!-- Request Parsing
+
+         These settings indicate how Solr Requests may be parsed, and
+         what restrictions may be placed on the ContentStreams from
+         those requests
+
+         enableRemoteStreaming - enables use of the stream.file
+         and stream.url parameters for specifying remote streams.
+
+         multipartUploadLimitInKB - specifies the max size of
+         Multipart File Uploads that Solr will allow in a Request.
+         
+         *** WARNING ***
+         The settings below authorize Solr to fetch remote files. You
+         should make sure your system has some authentication before
+         using enableRemoteStreaming="true"
+
+      --> 
+    <requestParsers enableRemoteStreaming="true" 
+                    multipartUploadLimitInKB="2048000"
+                    formdataUploadLimitInKB="2048"/>
+
+    <!-- HTTP Caching
+
+         Set HTTP caching related parameters (for proxy caches and clients).
+
+         The options below instruct Solr not to output any HTTP Caching
+         related headers
+      -->
+    <httpCaching never304="true" />
+    <!-- If you include a <cacheControl> directive, it will be used to
+         generate a Cache-Control header (as well as an Expires header
+         if the value contains "max-age=")
+         
+         By default, no Cache-Control header is generated.
+         
+         You can use the <cacheControl> option even if you have set
+         never304="true"
+      -->
+    <!--
+       <httpCaching never304="true" >
+         <cacheControl>max-age=30, public</cacheControl> 
+       </httpCaching>
+      -->
+    <!-- To enable Solr to respond with automatically generated HTTP
+         Caching headers, and to respond to Cache Validation requests
+         correctly, set the value of never304="false"
+         
+         This will cause Solr to generate Last-Modified and ETag
+         headers based on the properties of the Index.
+
+         The following options can also be specified to affect the
+         values of these headers...
+
+         lastModifiedFrom - the default value is "openTime" which means the
+         Last-Modified value (and validation against If-Modified-Since
+         requests) will all be relative to when the current Searcher
+         was opened.  You can change it to lastModifiedFrom="dirLastMod" if
+         you want the value to exactly correspond to when the physical
+         index was last modified.
+
+         etagSeed="..." is an option you can change to force the ETag
+         header (and validation against If-None-Match requests) to be
+         different even if the index has not changed (ie: when making
+         significant changes to your config file)
+
+         (lastModFrom and etagSeed are both ignored if you use
+         the never304="true" option)
+      -->
+    <!--
+       <httpCaching lastModFrom="openTime"
+                    etagSeed="Solr">
+         <cacheControl>max-age=30, public</cacheControl> 
+       </httpCaching>
+      -->
+  </requestDispatcher>
+
+  <!-- Request Handlers 
+
+       http://wiki.apache.org/solr/SolrRequestHandler
+
+       Incoming queries will be dispatched to a specific handler by name
+       based on the path specified in the request.
+
+       Legacy behavior: If the request path uses "/select" but no Request
+       Handler has that name, and if handleSelect="true" has been specified in
+       the requestDispatcher, then the Request Handler is dispatched based on
+       the qt parameter.  Handlers without a leading '/' are accessed this way
+       like so: http://host/app/[core/]select?qt=name  If no qt is
+       given, then the requestHandler that declares default="true" will be
+       used or the one named "standard".
+       
+       If a Request Handler is declared with startup="lazy", then it will
+       not be initialized until the first request that uses it.
+
+    -->
+  <!-- SearchHandler
+
+       http://wiki.apache.org/solr/SearchHandler
+
+       For processing Search Queries, the primary Request Handler
+       provided with Solr is "SearchHandler". It delegates to a sequence
+       of SearchComponents (see below) and supports distributed
+       queries across multiple shards
+    -->
+  <requestHandler name="/select" class="solr.SearchHandler">
+    <!-- default values for query parameters can be specified, these
+         will be overridden by parameters in the request
+      -->
+     <lst name="defaults">
+       <str name="echoParams">explicit</str>
+       <int name="rows">10</int>
+       <str name="df">text</str>
+     </lst>
+    <!-- In addition to defaults, "appends" params can be specified
+         to identify values which should be appended to the list of
+         multi-val params from the query (or the existing "defaults").
+      -->
+    <!-- In this example, the param "fq=inStock:true" would be appended to
+         any query time fq params the user may specify, as a mechanism for
+         partitioning the index, independent of any user selected filtering
+         that may also be desired (perhaps as a result of faceted searching).
+
+         NOTE: there is *absolutely* nothing a client can do to prevent these
+         "appends" values from being used, so don't use this mechanism
+         unless you are sure you always want it.
+      -->
+    <!--
+       <lst name="appends">
+         <str name="fq">inStock:true</str>
+       </lst>
+      -->
+    <!-- "invariants" are a way of letting the Solr maintainer lock down
+         the options available to Solr clients.  Any params values
+         specified here are used regardless of what values may be specified
+         in either the query, the "defaults", or the "appends" params.
+
+         In this example, the facet.field and facet.query params would
+         be fixed, limiting the facets clients can use.  Faceting is
+         not turned on by default - but if the client does specify
+         facet=true in the request, these are the only facets they
+         will be able to see counts for; regardless of what other
+         facet.field or facet.query params they may specify.
+
+         NOTE: there is *absolutely* nothing a client can do to prevent these
+         "invariants" values from being used, so don't use this mechanism
+         unless you are sure you always want it.
+      -->
+    <!--
+       <lst name="invariants">
+         <str name="facet.field">cat</str>
+         <str name="facet.field">manu_exact</str>
+         <str name="facet.query">price:[* TO 500]</str>
+         <str name="facet.query">price:[500 TO *]</str>
+       </lst>
+      -->
+    <!-- If the default list of SearchComponents is not desired, that
+         list can either be overridden completely, or components can be
+         prepended or appended to the default list.  (see below)
+      -->
+    <!--
+       <arr name="components">
+         <str>nameOfCustomComponent1</str>
+         <str>nameOfCustomComponent2</str>
+       </arr>
+      -->
+    </requestHandler>
+
+  <!-- A request handler that returns indented JSON by default -->
+  <requestHandler name="/query" class="solr.SearchHandler">
+     <lst name="defaults">
+       <str name="echoParams">explicit</str>
+       <str name="wt">json</str>
+       <str name="indent">true</str>
+       <str name="df">text</str>
+     </lst>
+  </requestHandler>
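+
+  <!-- Illustrative request for the "/query" handler above (a sketch;
+       the host, app and core path segments are placeholders):
+
+         http://host/app/[core/]query?q=*:*
+
+       Because wt=json and indent=true are only "defaults", a client
+       can still override them, for example by adding wt=xml.
+    -->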
+
+
+  <!-- realtime get handler, guaranteed to return the latest stored fields of
+       any document, without the need to commit or open a new searcher.  The
+       current implementation relies on the updateLog feature being enabled. -->
+  <requestHandler name="/get" class="solr.RealTimeGetHandler">
+     <lst name="defaults">
+       <str name="omitHeader">true</str>
+       <str name="wt">json</str>
+       <str name="indent">true</str>
+     </lst>
+  </requestHandler>
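+
+  <!-- Illustrative requests for the "/get" handler above (a sketch;
+       the host, app and core path segments and the document ids are
+       placeholders):
+
+         http://host/app/[core/]get?id=mydoc
+         http://host/app/[core/]get?ids=mydoc1,mydoc2
+    -->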
+
+ 
+  <!-- A Robust Example 
+       
+       This example SearchHandler declaration shows off usage of the
+       SearchHandler with many defaults declared
+
+       Note that the same Request Handler class (SearchHandler) can
+       be registered multiple times with different names (and
+       different init parameters)
+    -->
+  <requestHandler name="/browse" class="solr.SearchHandler">
+     <lst name="defaults">
+       <str name="echoParams">explicit</str>
+
+       <!-- VelocityResponseWriter settings -->
+       <str name="wt">velocity</str>
+       <str name="v.template">browse</str>
+       <str name="v.layout">layout</str>
+       <str name="title">Solritas</str>
+
+       <!-- Query settings -->
+       <str name="defType">edismax</str>
+       <str name="qf">
+          text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0 manu^1.1 cat^1.4
+          title^10.0 description^5.0 keywords^5.0 author^2.0 resourcename^1.0
+       </str>
+       <str name="df">text</str>
+       <str name="mm">100%</str>
+       <str name="q.alt">*:*</str>
+       <str name="rows">10</str>
+       <str name="fl">*,score</str>
+
+       <str name="mlt.qf">
+         text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0 manu^1.1 cat^1.4
+         title^10.0 description^5.0 keywords^5.0 author^2.0 resourcename^1.0
+       </str>
+       <str name="mlt.fl">text,features,name,sku,id,manu,cat,title,description,keywords,author,resourcename</str>
+       <int name="mlt.count">3</int>
+
+       <!-- Faceting defaults -->
+       <str name="facet">on</str>
+       <str name="facet.field">cat</str>
+       <str name="facet.field">manu_exact</str>
+       <str name="facet.field">content_type</str>
+       <str name="facet.field">author_s</str>
+       <str name="facet.query">ipod</str>
+       <str name="facet.query">GB</str>
+       <str name="facet.mincount">1</str>
+       <str name="facet.pivot">cat,inStock</str>
+       <str name="facet.range.other">after</str>
+       <str name="facet.range">price</str>
+       <int name="f.price.facet.range.start">0</int>
+       <int name="f.price.facet.range.end">600</int>
+       <int name="f.price.facet.range.gap">50</int>
+       <str name="facet.range">popularity</str>
+       <int name="f.popularity.facet.range.start">0</int>
+       <int name="f.popularity.facet.range.end">10</int>
+       <int name="f.popularity.facet.range.gap">3</int>
+       <str name="facet.range">manufacturedate_dt</str>
+       <str name="f.manufacturedate_dt.facet.range.start">NOW/YEAR-10YEARS</str>
+       <str name="f.manufacturedate_dt.facet.range.end">NOW</str>
+       <str name="f.manufacturedate_dt.facet.range.gap">+1YEAR</str>
+       <str name="f.manufacturedate_dt.facet.range.other">before</str>
+       <str name="f.manufacturedate_dt.facet.range.other">after</str>
+
+       <!-- Highlighting defaults -->
+       <str name="hl">on</str>
+       <str name="hl.fl">content features title name</str>
+       <str name="hl.encoder">html</str>
+       <str name="hl.simple.pre">&lt;b&gt;</str>
+       <str name="hl.simple.post">&lt;/b&gt;</str>
+       <str name="f.title.hl.fragsize">0</str>
+       <str name="f.title.hl.alternateField">title</str>
+       <str name="f.name.hl.fragsize">0</str>
+       <str name="f.name.hl.alternateField">name</str>
+       <str name="f.content.hl.snippets">3</str>
+       <str name="f.content.hl.fragsize">200</str>
+       <str name="f.content.hl.alternateField">content</str>
+       <str name="f.content.hl.maxAlternateFieldLength">750</str>
+
+       <!-- Spell checking defaults -->
+       <str name="spellcheck">on</str>
+       <str name="spellcheck.extendedResults">false</str>       
+       <str name="spellcheck.count">5</str>
+       <str name="spellcheck.alternativeTermCount">2</str>
+       <str name="spellcheck.maxResultsForSuggest">5</str>       
+       <str name="spellcheck.collate">true</str>
+       <str name="spellcheck.collateExtendedResults">true</str>  
+       <str name="spellcheck.maxCollationTries">5</str>
+       <str name="spellcheck.maxCollations">3</str>           
+     </lst>
+
+     <!-- append spellchecking to our list of components -->
+     <arr name="last-components">
+       <str>spellcheck</str>
+     </arr>
+  </requestHandler>
+
+
+  <!-- Update Request Handler.  
+       
+       http://wiki.apache.org/solr/UpdateXmlMessages
+
+       The canonical Request Handler for Modifying the Index through
+       commands specified using XML, JSON, CSV, or JAVABIN
+
+       Note: Since Solr 1.1, requestHandlers require a valid content
+       type header if data is posted in the body. For example, curl now
+       requires: -H 'Content-type:text/xml; charset=utf-8'
+       
+       To override the request content type and force a specific 
+       Content-type, use the request parameter: 
+         ?update.contentType=text/csv
+       
+       This handler will pick a response format to match the input
+       if the 'wt' parameter is not explicit
+    -->
+  <requestHandler name="/update" class="solr.UpdateRequestHandler">
+    <!-- See below for information on defining 
+         updateRequestProcessorChains that can be used by name 
+         on each Update Request
+      -->
+    <!--
+       <lst name="defaults">
+         <str name="update.chain">dedupe</str>
+       </lst>
+       -->
+  </requestHandler>
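+
+  <!-- Illustrative update request (a sketch; the URL assumes a local
+       Solr at localhost:8983/solr, and the field value is a
+       placeholder):
+
+         curl 'http://localhost:8983/solr/update?commit=true' \
+              -H 'Content-type:text/xml; charset=utf-8' \
+              -d '<add><doc><field name="id">1</field></doc></add>'
+    -->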
+
+  <!-- for back compat with clients using /update/json and /update/csv -->  
+  <requestHandler name="/update/json" class="solr.JsonUpdateRequestHandler">
+        <lst name="defaults">
+         <str name="stream.contentType">application/json</str>
+       </lst>
+  </requestHandler>
+  <requestHandler name="/update/csv" class="solr.CSVRequestHandler">
+        <lst name="defaults">
+         <str name="stream.contentType">application/csv</str>
+       </lst>
+  </requestHandler>
+
+  <!-- Solr Cell Update Request Handler
+
+       http://wiki.apache.org/solr/ExtractingRequestHandler 
+
+    -->
+  <requestHandler name="/update/extract" 
+                  startup="lazy"
+                  class="solr.extraction.ExtractingRequestHandler" >
+    <lst name="defaults">
+      <str name="lowernames">true</str>
+      <str name="uprefix">ignored_</str>
+      <!--<str name="uprefix">attr_</str>-->
+
+      <str name="fmap.content">text</str>
+
+      <!-- twitter feed schema -->
+      <str name="capture">id</str>
+      <str name="capture">user_friends_count</str>
+      <str name="capture">user_location</str>
+      <str name="capture">user_description</str>
+      <str name="capture">user_statuses_count</str>
+      <str name="capture">user_followers_count</str>
+      <str name="capture">user_name</str>
+      <str name="capture">user_screen_name</str>
+      <str name="capture">created_at</str>
+      <str name="capture">text</str>
+      <str name="capture">retweet_count</str>
+      <str name="capture">retweeted</str>
+      <str name="capture">in_reply_to_user_id</str>
+      <str name="capture">source</str>
+      <str name="capture">in_reply_to_status_id</str>
+      <str name="capture">media_url_https</str>
+      <str name="capture">expanded_url</str>
+      
+      <!-- file metadata -->   
+      <str name="capture">file_download_url</str>
+      <str name="capture">file_upload_url</str>
+      <str name="capture">file_scheme</str>
+      <str name="capture">file_host</str>
+      <str name="capture">file_port</str>
+      <str name="capture">file_path</str>
+      <str name="capture">file_name</str>
+      <str name="capture">file_length</str>
+      <str name="capture">file_last_modified</str>
+      <str name="capture">file_owner</str>
+      <str name="capture">file_group</str>
+      <str name="capture">file_permissions_user</str>
+      <str name="capture">file_permissions_group</str>
+      <str name="capture">file_permissions_other</str>
+      <str name="capture">file_permissions_stickybit</str>
+      
+      <!-- tika metadata -->   
+      <str name="content-type">content_type</str>
+
+      <!-- capture link hrefs but ignore div attributes -->
+      <!--
+      <str name="captureAttr">true</str>
+      <str name="fmap.a">links</str>
+      <str name="fmap.div">ignored_</str>
+      -->            
+    </lst>
+  </requestHandler>
+
+
+  <!-- Field Analysis Request Handler
+
+       RequestHandler that provides much the same functionality as
+       analysis.jsp. Provides the ability to specify multiple field
+       types and field names in the same request and outputs
+       index-time and query-time analysis for each of them.
+
+       Request parameters are:
+       analysis.fieldname - field name whose analyzers are to be used
+
+       analysis.fieldtype - field type whose analyzers are to be used
+       analysis.fieldvalue - text for index-time analysis
+       q (or analysis.q) - text for query time analysis
+       analysis.showmatch (true|false) - When set to true and when
+           query analysis is performed, the produced tokens of the
+           field value analysis will be marked as "matched" for every
+           token that is produced by the query analysis
+   -->
+  <requestHandler name="/analysis/field" 
+                  startup="lazy"
+                  class="solr.FieldAnalysisRequestHandler" />
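+
+  <!-- Illustrative request using the parameters listed above (a
+       sketch; the path placeholders and the field type name
+       "text_general" are assumptions, use a type from your schema):
+
+         http://host/app/[core/]analysis/field?analysis.fieldtype=text_general&analysis.fieldvalue=The+Text+Value&q=text&analysis.showmatch=true
+    -->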
+
+
+  <!-- Document Analysis Handler
+
+       http://wiki.apache.org/solr/AnalysisRequestHandler
+
+       An analysis handler that provides a breakdown of the analysis
+       process of provided documents. This handler expects a (single)
+       content stream with the following format:
+
+       <docs>
+         <doc>
+           <field name="id">1</field>
+           <field name="name">The Name</field>
+           <field name="text">The Text Value</field>
+         </doc>
+         <doc>...</doc>
+         <doc>...</doc>
+         ...
+       </docs>
+
+    Note: Each document must contain a field which serves as the
+    unique key. This key is used in the returned response to associate
+    an analysis breakdown to the analyzed document.
+
+    Like the FieldAnalysisRequestHandler, this handler also supports
+    query analysis by sending either an "analysis.query" or "q"
+    request parameter that holds the query text to be analyzed. It
+    also supports the "analysis.showmatch" parameter; when it is set
+    to true, all field tokens that match the query tokens are marked
+    as a "match".
+  -->
+  <requestHandler name="/analysis/document" 
+                  class="solr.DocumentAnalysisRequestHandler" 
+                  startup="lazy" />
+
+  <!-- Admin Handlers
+
+       Admin Handlers - This will register all the standard admin
+       RequestHandlers.  
+    -->
+  <requestHandler name="/admin/" 
+                  class="solr.admin.AdminHandlers" />
+  <!-- This single handler is equivalent to the following... -->
+  <!--
+     <requestHandler name="/admin/luke"       class="solr.admin.LukeRequestHandler" />
+     <requestHandler name="/admin/system"     class="solr.admin.SystemInfoHandler" />
+     <requestHandler name="/admin/plugins"    class="solr.admin.PluginInfoHandler" />
+     <requestHandler name="/admin/threads"    class="solr.admin.ThreadDumpHandler" />
+     <requestHandler name="/admin/properties" class="solr.admin.PropertiesRequestHandler" />
+     <requestHandler name="/admin/file"       class="solr.admin.ShowFileRequestHandler" />
+    -->
+  <!-- If you wish to hide files under ${solr.home}/conf, explicitly
+       register the ShowFileRequestHandler using: 
+    -->
+  <!--
+     <requestHandler name="/admin/file" 
+                     class="solr.admin.ShowFileRequestHandler" >
+       <lst name="invariants">
+         <str name="hidden">synonyms.txt</str> 
+         <str name="hidden">anotherfile.txt</str> 
+       </lst>
+     </requestHandler>
+    -->
+
+  <!-- ping/healthcheck -->
+  <requestHandler name="/admin/ping" class="solr.PingRequestHandler">
+    <lst name="invariants">
+      <str name="q">solrpingquery</str>
+    </lst>
+    <lst name="defaults">
+      <str name="echoParams">all</str>
+    </lst>
+    <!-- An optional feature of the PingRequestHandler is to configure the 
+         handler with a "healthcheckFile" which can be used to enable/disable 
+         the PingRequestHandler.
+         relative paths are resolved against the data dir 
+      -->
+    <!-- <str name="healthcheckFile">server-enabled.txt</str> -->
+  </requestHandler>
+
+  <!-- Echo the request contents back to the client -->
+  <requestHandler name="/debug/dump" class="solr.DumpRequestHandler" >
+    <lst name="defaults">
+     <str name="echoParams">explicit</str> 
+     <str name="echoHandler">true</str>
+    </lst>
+  </requestHandler>
+  
+  <!-- Solr Replication
+
+       The SolrReplicationHandler supports replicating indexes from a
+       "master" used for indexing and "slaves" used for queries.
+
+       http://wiki.apache.org/solr/SolrReplication 
+
+       It is also necessary for SolrCloud to function (in Cloud mode, the
+       replication handler is used to bulk transfer segments when nodes 
+       are added or need to recover).
+
+       https://wiki.apache.org/solr/SolrCloud/
+    -->
+  <requestHandler name="/replication" class="solr.ReplicationHandler" > 
+    <!--
+       To enable simple master/slave replication, uncomment one of the 
+       sections below, depending on whether this solr instance should be
+       the "master" or a "slave".  If this instance is a "slave" you will 
+       also need to fill in the masterUrl to point to a real machine.
+    -->
+    <!--
+       <lst name="master">
+         <str name="replicateAfter">commit</str>
+         <str name="replicateAfter">startup</str>
+         <str name="confFiles">schema.xml,stopwords.txt</str>
+       </lst>
+    -->
+    <!--
+       <lst name="slave">
+         <str name="masterUrl">http://your-master-hostname:8983/solr</str>
+         <str name="pollInterval">00:00:60</str>
+       </lst>
+    -->
+  </requestHandler>
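+
+  <!-- Illustrative status request for the "/replication" handler
+       above (a sketch; the host, app and core path segments are
+       placeholders):
+
+         http://host/app/[core/]replication?command=details
+    -->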
+
+  <!-- Search Components
+
+       Search components are registered to SolrCore and used by 
+       instances of SearchHandler (which can access them by name)
+       
+       By default, the following components are available:
+       
+       <searchComponent name="query"     class="solr.QueryComponent" />
+       <searchComponent name="facet"     class="solr.FacetComponent" />
+       <searchComponent name="mlt"       class="solr.MoreLikeThisComponent" />
+       <searchComponent name="highlight" class="solr.HighlightComponent" />
+       <searchComponent name="stats"     class="solr.StatsComponent" />
+       <searchComponent name="debug"     class="solr.DebugComponent" />
+   
+       Default configuration in a requestHandler would look like:
+
+       <arr name="components">
+         <str>query</str>
+         <str>facet</str>
+         <str>mlt</str>
+         <str>highlight</str>
+         <str>stats</str>
+         <str>debug</str>
+       </arr>
+
+       If you register a searchComponent to one of the standard names, 
+       that will be used instead of the default.
+
+       To insert components before or after the 'standard' components, use:
+    
+       <arr name="first-components">
+         <str>myFirstComponentName</str>
+       </arr>
+    
+       <arr name="last-components">
+         <str>myLastComponentName</str>
+       </arr>
+
+       NOTE: The component registered with the name "debug" will
+       always be executed after the "last-components" 
+       
+     -->
+  
+   <!-- Spell Check
+
+        The spell check component can return a list of alternative spelling
+        suggestions.  
+
+        http://wiki.apache.org/solr/SpellCheckComponent
+     -->
+  <searchComponent name="spellcheck" class="solr.SpellCheckComponent">
+
+    <str name="queryAnalyzerFieldType">textSpell</str>
+
+    <!-- Multiple "Spell Checkers" can be declared and used by this
+         component
+      -->
+
+    <!-- a spellchecker built from a field of the main index -->
+    <lst name="spellchecker">
+      <str name="name">default</str>
+      <str name="field">name</str>
+      <str name="classname">solr.DirectSolrSpellChecker</str>
+      <!-- the spellcheck distance measure used, the default is the internal levenshtein -->
+      <str name="distanceMeasure">internal</str>
+      <!-- minimum accuracy needed to be considered a valid spellcheck suggestion -->
+      <float name="accuracy">0.5</float>
+      <!-- the maximum #edits we consider when enumerating terms: can be 1 or 2 -->
+      <int name="maxEdits">2</int>
+      <!-- the minimum shared prefix when enumerating terms -->
+      <int name="minPrefix">1</int>
+      <!-- maximum number of inspections per result. -->
+      <int name="maxInspections">5</int>
+      <!-- minimum length of a query term to be considered for correction -->
+      <int name="minQueryLength">4</int>
+      <!-- maximum threshold of documents a query term can appear in to be considered for correction -->
+      <float name="maxQueryFrequency">0.01</float>
+      <!-- uncomment this to require suggestions to occur in 1% of the documents
+      	<float name="thresholdTokenFrequency">.01</float>
+      -->
+    </lst>
+    
+    <!-- a spellchecker that can break or combine words.  See "/spell" handler below for usage -->
+    <lst name="spellchecker">
+      <str name="name">wordbreak</str>
+      <str name="classname">solr.WordBreakSolrSpellChecker</str>      
+      <str name="field">name</str>
+      <str name="combineWords">true</str>
+      <str name="breakWords">true</str>
+      <int name="maxChanges">10</int>
+    </lst>
+
+    <!-- a spellchecker that uses a different distance measure -->
+    <!--
+       <lst name="spellchecker">
+         <str name="name">jarowinkler</str>
+         <str name="field">spell</str>
+         <str name="classname">solr.DirectSolrSpellChecker</str>
+         <str name="distanceMeasure">
+           org.apache.lucene.search.spell.JaroWinklerDistance
+         </str>
+       </lst>
+     -->
+
+    <!-- a spellchecker that uses an alternate comparator
+
+         comparatorClass can be one of:
+          1. score (default)
+          2. freq (Frequency first, then score)
+          3. A fully qualified class name
+      -->
+    <!--
+       <lst name="spellchecker">
+         <str name="name">freq</str>
+         <str name="field">lowerfilt</str>
+         <str name="classname">solr.DirectSolrSpellChecker</str>
+         <str name="comparatorClass">freq</str>
+       </lst>
+      -->
+
+    <!-- A spellchecker that reads the list of words from a file -->
+    <!--
+       <lst name="spellchecker">
+         <str name="classname">solr.FileBasedSpellChecker</str>
+         <str name="name">file</str>
+         <str name="sourceLocation">spellings.txt</str>
+         <str name="characterEncoding">UTF-8</str>
+         <str name="spellcheckIndexDir">spellcheckerFile</str>
+       </lst>
+      -->
+  </searchComponent>
+
+  <!-- A request handler for demonstrating the spellcheck component.  
+
+       NOTE: This is purely as an example.  The whole purpose of the
+       SpellCheckComponent is to hook it into the request handler that
+       handles your normal user queries so that a separate request is
+       not needed to get suggestions.
+
+       IN OTHER WORDS, THERE IS A REALLY GOOD CHANCE THE SETUP BELOW IS
+       NOT WHAT YOU WANT FOR YOUR PRODUCTION SYSTEM!
+       
+       See http://wiki.apache.org/solr/SpellCheckComponent for details
+       on the request parameters.
+    -->
+  <requestHandler name="/spell" class="solr.SearchHandler" startup="lazy">
+    <lst name="defaults">
+      <str name="df">text</str>
+      <!-- Solr will use suggestions from both the 'default' spellchecker
+           and from the 'wordbreak' spellchecker and combine them.
+           collations (re-written queries) can include a combination of
+           corrections from both spellcheckers -->
+      <str name="spellcheck.dictionary">default</str>
+      <str name="spellcheck.dictionary">wordbreak</str>
+      <str name="spellcheck">on</str>
+      <str name="spellcheck.extendedResults">true</str>       
+      <str name="spellcheck.count">10</str>
+      <str name="spellcheck.alternativeTermCount">5</str>
+      <str name="spellcheck.maxResultsForSuggest">5</str>       
+      <str name="spellcheck.collate">true</str>
+      <str name="spellcheck.collateExtendedResults">true</str>  
+      <str name="spellcheck.maxCollationTries">10</str>
+      <str name="spellcheck.maxCollations">5</str>         
+    </lst>
+    <arr name="last-components">
+      <str>spellcheck</str>
+    </arr>
+  </requestHandler>
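+
+  <!-- Illustrative request for the "/spell" handler above (a sketch;
+       the path placeholders and the misspelled query are assumptions):
+
+         http://host/app/[core/]spell?q=solr+serch
+
+       The defaults above already enable spellcheck and collation, so
+       no spellcheck.* parameters are needed in the request.
+    -->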
+
+  <!-- Term Vector Component
+
+       http://wiki.apache.org/solr/TermVectorComponent
+    -->
+  <searchComponent name="tvComponent" class="solr.TermVectorComponent"/>
+
+  <!-- A request handler for demonstrating the term vector component
+
+       This is purely as an example.
+
+       In reality you will likely want to add the component to your 
+       already specified request handlers. 
+    -->
+  <requestHandler name="/tvrh" class="solr.SearchHandler" startup="lazy">
+    <lst name="defaults">
+      <str name="df">text</str>
+      <bool name="tv">true</bool>
+    </lst>
+    <arr name="last-components">
+      <str>tvComponent</str>
+    </arr>
+  </requestHandler>
+
+  <!-- Clustering Component
+
+       http://wiki.apache.org/solr/ClusteringComponent
+
+       You'll need to set the solr.clustering.enabled system property
+       when running solr to run with clustering enabled:
+
+            java -Dsolr.clustering.enabled=true -jar start.jar
+
+    -->
+  <searchComponent name="clustering"
+                   enable="${solr.clustering.enabled:false}"
+                   class="solr.clustering.ClusteringComponent" >
+    <!-- Declare an engine -->
+    <lst name="engine">
+      <!-- The name, only one can be named "default" -->
+      <str name="name">default</str>
+
+      <!-- Class name of Carrot2 clustering algorithm.
+
+           Currently available algorithms are:
+           
+           * org.carrot2.clustering.lingo.LingoClusteringAlgorithm
+           * org.carrot2.clustering.stc.STCClusteringAlgorithm
+           * org.carrot2.clustering.kmeans.BisectingKMeansClusteringAlgorithm
+           
+           See http://project.carrot2.org/algorithms.html for the
+           algorithm's characteristics.
+        -->
+      <str name="carrot.algorithm">org.carrot2.clustering.lingo.LingoClusteringAlgorithm</str>
+
+      <!-- Overriding values for Carrot2 default algorithm attributes.
+
+           For a description of all available attributes, see:
+           http://download.carrot2.org/stable/manual/#chapter.components.
+           Use attribute key as name attribute of str elements
+           below. These can be further overridden for individual
+           requests by specifying attribute key as request parameter
+           name and attribute value as parameter value.
+        -->
+      <str name="LingoClusteringAlgorithm.desiredClusterCountBase">20</str>
+
+      <!-- Location of Carrot2 lexical resources.
+
+           A directory from which to load Carrot2-specific stop words
+           and stop labels. Absolute or relative to Solr config directory.
+           If a specific resource (e.g. stopwords.en) is present in the
+           specified dir, it will completely override the corresponding
+           default one that ships with Carrot2.
+
+           For an overview of Carrot2 lexical resources, see:
+           http://download.carrot2.org/head/manual/#chapter.lexical-resources
+        -->
+      <str name="carrot.lexicalResourcesDir">clustering/carrot2</str>
+
+      <!-- The language to assume for the documents.
+
+           For a list of allowed values, see:
+           http://download.carrot2.org/stable/manual/#section.attribute.lingo.MultilingualClustering.defaultLanguage
+       -->
+      <str name="MultilingualClustering.defaultLanguage">ENGLISH</str>
+    </lst>
+    <lst name="engine">
+      <str name="name">stc</str>
+      <str name="carrot.algorithm">org.carrot2.clustering.stc.STCClusteringAlgorithm</str>
+    </lst>
+  </searchComponent>
+
+  <!-- A request handler for demonstrating the clustering component
+
+       This is purely as an example.
+
+       In reality you will likely want to add the component to your 
+       already specified request handlers. 
+    -->
+  <requestHandler name="/clustering"
+                  startup="lazy"
+                  enable="${solr.clustering.enabled:false}"
+                  class="solr.SearchHandler">
+    <lst name="defaults">
+      <bool name="clustering">true</bool>
+      <str name="clustering.engine">default</str>
+      <bool name="clustering.results">true</bool>
+      <!-- The title field -->
+      <str name="carrot.title">name</str>
+      <str name="carrot.url">id</str>
+      <!-- The field to cluster on -->
+       <str name="carrot.snippet">features</str>
+       <!-- produce summaries -->
+       <bool name="carrot.produceSummary">true</bool>
+       <!-- the maximum number of labels per cluster -->
+       <!--<int name="carrot.numDescriptions">5</int>-->
+       <!-- produce sub clusters -->
+       <bool name="carrot.outputSubClusters">false</bool>
+       
+       <str name="defType">edismax</str>
+       <str name="qf">
+         text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0 manu^1.1 cat^1.4
+       </str>
+       <str name="q.alt">*:*</str>
+       <str name="rows">10</str>
+       <str name="fl">*,score</str>
+    </lst>     
+    <arr name="last-components">
+      <str>clustering</str>
+    </arr>
+  </requestHandler>
+  
+  <!-- Terms Component
+
+       http://wiki.apache.org/solr/TermsComponent
+
+       A component to return terms and document frequency of those
+       terms
+    -->
+  <searchComponent name="terms" class="solr.TermsComponent"/>
+
+  <!-- A request handler for demonstrating the terms component -->
+  <requestHandler name="/terms" class="solr.SearchHandler" startup="lazy">
+     <lst name="defaults">
+      <bool name="terms">true</bool>
+    </lst>     
+    <arr name="components">
+      <str>terms</str>
+    </arr>
+  </requestHandler>
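+
+  <!-- Illustrative request for the "/terms" handler above (a sketch;
+       the path placeholders, field name and prefix are assumptions):
+
+         http://host/app/[core/]terms?terms.fl=name&terms.prefix=so&terms.limit=10
+    -->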
+
+
+  <!-- Query Elevation Component
+
+       http://wiki.apache.org/solr/QueryElevationComponent
+
+       A search component that enables you to configure the top
+       results for a given query regardless of the normal Lucene
+       scoring.
+    -->
+  <searchComponent name="elevator" class="solr.QueryElevationComponent" >
+    <!-- pick a fieldType to analyze queries -->
+    <str name="queryFieldType">string</str>
+    <str name="config-file">elevate.xml</str>
+  </searchComponent>
+
+  <!-- A request handler for demonstrating the elevator component -->
+  <requestHandler name="/elevate" class="solr.SearchHandler" startup="lazy">
+    <lst name="defaults">
+      <str name="echoParams">explicit</str>
+      <str name="df">text</str>
+    </lst>
+    <arr name="last-components">
+      <str>elevator</str>
+    </arr>
+  </requestHandler>
+
+  <!-- Highlighting Component
+
+       http://wiki.apache.org/solr/HighlightingParameters
+    -->
+  <searchComponent class="solr.HighlightComponent" name="highlight">
+    <highlighting>
+      <!-- Configure the standard fragmenter -->
+      <!-- This could most likely be commented out in the "default" case -->
+      <fragmenter name="gap" 
+                  default="true"
+                  class="solr.highlight.GapFragmenter">
+        <lst name="defaults">
+          <int name="hl.fragsize">100</int>
+        </lst>
+      </fragmenter>
+
+      <!-- A regular-expression-based fragmenter 
+           (for sentence extraction) 
+        -->
+      <fragmenter name="regex" 
+                  class="solr.highlight.RegexFragmenter">
+        <lst name="defaults">
+          <!-- slightly smaller fragsizes work better because of slop -->
+          <int name="hl.fragsize">70</int>
+          <!-- allow 50% slop on fragment sizes -->
+          <float name="hl.regex.slop">0.5</float>
+          <!-- a basic sentence pattern -->
+          <str name="hl.regex.pattern">[-\w ,/\n\&quot;&apos;]{20,200}</str>
+        </lst>
+      </fragmenter>
+
+      <!-- Configure the standard formatter -->
+      <formatter name="html" 
+                 default="true"
+                 class="solr.highlight.HtmlFormatter">
+        <lst name="defaults">
+          <str name="hl.simple.pre"><![CDATA[<em>]]></str>
+          <str name="hl.simple.post"><![CDATA[</em>]]></str>
+        </lst>
+      </formatter>
+
+      <!-- Configure the standard encoder -->
+      <encoder name="html" 
+               class="solr.highlight.HtmlEncoder" />
+
+      <!-- Configure the standard fragListBuilder -->
+      <fragListBuilder name="simple" 
+                       class="solr.highlight.SimpleFragListBuilder"/>
+      
+      <!-- Configure the single fragListBuilder -->
+      <fragListBuilder name="single" 
+                       class="solr.highlight.SingleFragListBuilder"/>
+      
+      <!-- Configure the weighted fragListBuilder -->
+      <fragListBuilder name="weighted" 
+                       default="true"
+                       class="solr.highlight.WeightedFragListBuilder"/>
+      
+      <!-- default tag FragmentsBuilder -->
+      <fragmentsBuilder name="default" 
+                        default="true"
+                        class="solr.highlight.ScoreOrderFragmentsBuilder">
+        <!-- 
+        <lst name="defaults">
+          <str name="hl.multiValuedSeparatorChar">/</str>
+        </lst>
+        -->
+      </fragmentsBuilder>
+
+      <!-- multi-colored tag FragmentsBuilder -->
+      <fragmentsBuilder name="colored" 
+                        class="solr.highlight.ScoreOrderFragmentsBuilder">
+        <lst name="defaults">
+          <str name="hl.tag.pre"><![CDATA[
+               <b style="background:yellow">,<b style="background:lawngreen">,
+               <b style="background:aquamarine">,<b style="background:magenta">,
+               <b style="background:palegreen">,<b style="background:coral">,
+               <b style="background:wheat">,<b style="background:khaki">,
+               <b style="background:lime">,<b style="background:deepskyblue">]]></str>
+          <str name="hl.tag.post"><![CDATA[</b>]]></str>
+        </lst>
+      </fragmentsBuilder>
+      
+      <boundaryScanner name="default" 
+                       default="true"
+                       class="solr.highlight.SimpleBoundaryScanner">
+        <lst name="defaults">
+          <str name="hl.bs.maxScan">10</str>
+          <str name="hl.bs.chars">.,!? &#9;&#10;&#13;</str>
+        </lst>
+      </boundaryScanner>
+      
+      <boundaryScanner name="breakIterator" 
+                       class="solr.highlight.BreakIteratorBoundaryScanner">
+        <lst name="defaults">
+          <!-- type should be one of CHARACTER, WORD(default), LINE and SENTENCE -->
+          <str name="hl.bs.type">WORD</str>
+          <!-- language and country are used when constructing Locale object.  -->
+          <!-- And the Locale object will be used when getting instance of BreakIterator -->
+          <str name="hl.bs.language">en</str>
+          <str name="hl.bs.country">US</str>
+        </lst>
+      </boundaryScanner>
+    </highlighting>
+  </searchComponent>
+
+  <!-- Update Processors
+
+       Chains of Update Processor Factories for dealing with Update
+       Requests can be declared, and then used by name in Update
+       Request Processors
+
+       http://wiki.apache.org/solr/UpdateRequestProcessor
+
+    --> 
+  <!-- Deduplication
+
+       An example dedup update processor that creates the "id" field
+       on the fly based on the hash code of some other fields.  This
+       example has overwriteDupes set to false since we are using the
+       id field as the signatureField and Solr will maintain
+       uniqueness based on that anyway.  
+       
+    -->
+  <!--
+     <updateRequestProcessorChain name="dedupe">
+       <processor class="solr.processor.SignatureUpdateProcessorFactory">
+         <bool name="enabled">true</bool>
+         <str name="signatureField">id</str>
+         <bool name="overwriteDupes">false</bool>
+         <str name="fields">name,features,cat</str>
+         <str name="signatureClass">solr.processor.Lookup3Signature</str>
+       </processor>
+       <processor class="solr.LogUpdateProcessorFactory" />
+       <processor class="solr.RunUpdateProcessorFactory" />
+     </updateRequestProcessorChain>
+    -->
+  
+  <!-- Language identification
+
+       This example update chain identifies the language of the incoming
+       documents using the langid contrib. The detected language is
+       written to field language_s. No field name mapping is done.
+       The fields used for detection are text, title, subject and description,
+       making this example suitable for detecting languages from full-text
+       rich documents injected via ExtractingRequestHandler.
+       See more about langId at http://wiki.apache.org/solr/LanguageDetection
+    -->
+    <!--
+     <updateRequestProcessorChain name="langid">
+       <processor class="org.apache.solr.update.processor.TikaLanguageIdentifierUpdateProcessorFactory">
+         <str name="langid.fl">text,title,subject,description</str>
+         <str name="langid.langField">language_s</str>
+         <str name="langid.fallback">en</str>
+       </processor>
+       <processor class="solr.LogUpdateProcessorFactory" />
+       <processor class="solr.RunUpdateProcessorFactory" />
+     </updateRequestProcessorChain>
+    -->
+
+  <!-- Script update processor
+
+    This example hooks in an update processor implemented using JavaScript.
+
+    See more about the script update processor at http://wiki.apache.org/solr/ScriptUpdateProcessor
+  -->
+  <!--
+    <updateRequestProcessorChain name="script">
+      <processor class="solr.StatelessScriptUpdateProcessorFactory">
+        <str name="script">update-script.js</str>
+        <lst name="params">
+          <str name="config_param">example config parameter</str>
+        </lst>
+      </processor>
+      <processor class="solr.RunUpdateProcessorFactory" />
+    </updateRequestProcessorChain>
+  -->
+ 
+  <!-- Response Writers
+
+       http://wiki.apache.org/solr/QueryResponseWriter
+
+       Request responses will be written using the writer specified by
+       the 'wt' request parameter matching the name of a registered
+       writer.
+
+       The "default" writer is the default and will be used if 'wt' is
+       not specified in the request.
+    -->
+  <!-- The following response writers are implicitly configured unless
+       overridden...
+    -->
+  <!--
+     <queryResponseWriter name="xml" 
+                          default="true"
+                          class="solr.XMLResponseWriter" />
+     <queryResponseWriter name="json" class="solr.JSONResponseWriter"/>
+     <queryResponseWriter name="python" class="solr.PythonResponseWriter"/>
+     <queryResponseWriter name="ruby" class="solr.RubyResponseWriter"/>
+     <queryResponseWriter name="php" class="solr.PHPResponseWriter"/>
+     <queryResponseWriter name="phps" class="solr.PHPSerializedResponseWriter"/>
+     <queryResponseWriter name="csv" class="solr.CSVResponseWriter"/>
+    -->
+
+  <queryResponseWriter name="json" class="solr.JSONResponseWriter">
+     <!-- For the purposes of the tutorial, JSON responses are written as
+      plain text so that they are easy to read in *any* browser.
+      If you expect a MIME type of "application/json" just remove this override.
+     -->
+    <str name="content-type">text/plain; charset=UTF-8</str>
+  </queryResponseWriter>
+  
+  <!--
+     Custom response writers can be declared as needed...
+    -->
+    <queryResponseWriter name="velocity" class="solr.VelocityResponseWriter" startup="lazy"/>
+  
+
+  <!-- XSLT response writer transforms the XML output by any xslt file found
+       in Solr's conf/xslt directory.  Changes to xslt files are checked for
+       every xsltCacheLifetimeSeconds.  
+    -->
+  <queryResponseWriter name="xslt" class="solr.XSLTResponseWriter">
+    <int name="xsltCacheLifetimeSeconds">5</int>
+  </queryResponseWriter>
+
+  <!-- Query Parsers
+
+       http://wiki.apache.org/solr/SolrQuerySyntax
+
+       Multiple QParserPlugins can be registered by name, and then
+       used in either the "defType" param for the QueryComponent (used
+       by SearchHandler) or in LocalParams
+    -->
+  <!-- example of registering a query parser -->
+  <!--
+     <queryParser name="myparser" class="com.mycompany.MyQParserPlugin"/>
+    -->
+
+  <!-- Function Parsers
+
+       http://wiki.apache.org/solr/FunctionQuery
+
+       Multiple ValueSourceParsers can be registered by name, and then
+       used as function names when using the "func" QParser.
+    -->
+  <!-- example of registering a custom function parser  -->
+  <!--
+     <valueSourceParser name="myfunc" 
+                        class="com.mycompany.MyValueSourceParser" />
+    -->
+    
+  
+  <!-- Document Transformers
+       http://wiki.apache.org/solr/DocTransformers
+    -->
+  <!--
+     Could be something like:
+     <transformer name="db" class="com.mycompany.LoadFromDatabaseTransformer" >
+       <int name="connection">jdbc://....</int>
+     </transformer>
+     
+     To add a constant value to all docs, use:
+     <transformer name="mytrans2" class="org.apache.solr.response.transform.ValueAugmenterFactory" >
+       <int name="value">5</int>
+     </transformer>
+     
+     If you want the user to still be able to change it with _value:something_ use this:
+     <transformer name="mytrans3" class="org.apache.solr.response.transform.ValueAugmenterFactory" >
+       <double name="defaultValue">5</double>
+     </transformer>
+
+      If you are using the QueryElevationComponent, you may wish to mark documents that get boosted.  The
+      EditorialMarkerFactory will do exactly that:
+     <transformer name="qecBooster" class="org.apache.solr.response.transform.EditorialMarkerFactory" />
+    -->
+    
+
+  <!-- Legacy config for the admin interface -->
+  <admin>
+    <defaultQuery>*:*</defaultQuery>
+  </admin>
+
+</config>
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/stopwords.txt b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/stopwords.txt
new file mode 100644
index 0000000..ae1e83e
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/stopwords.txt
@@ -0,0 +1,14 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/synonyms.txt b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/synonyms.txt
new file mode 100644
index 0000000..7f72128
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/solr/collection1/conf/synonyms.txt
@@ -0,0 +1,29 @@
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+#-----------------------------------------------------------------------
+#some test synonym mappings unlikely to appear in real input text
+aaafoo => aaabar
+bbbfoo => bbbfoo bbbbar
+cccfoo => cccbar cccbaz
+fooaaa,baraaa,bazaaa
+
+# Some synonym groups specific to this example
+GB,gib,gigabyte,gigabytes
+MB,mib,megabyte,megabytes
+Television, Televisions, TV, TVs
+#notice we use "gib" instead of "GiB" so any WordDelimiterFilter coming
+#after us won't split it into two words.
+
+# Synonym mappings can be used for spelling correction too
+pixima => pixma
+
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/NullHeader.docx b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/NullHeader.docx
new file mode 100644
index 0000000..cc62b8d
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/NullHeader.docx differ
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/boilerplate.html b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/boilerplate.html
new file mode 100644
index 0000000..854ebcd
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/boilerplate.html
@@ -0,0 +1,41 @@
+<?xml version="1.0" encoding="utf-8"?>
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
+        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
+<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
+<head>
+	<meta http-equiv="content-type" content="text/html; charset=utf-8" />
+	<title>Title</title>
+</head>
+<body>
+
+<table>
+	<tr>
+		<td>
+			<table>
+				<tr>
+					<td ><a href="Main.php">boilerplate</a></td>
+					<td ><a href="Main.php">text</a></td>
+				</tr>
+			</table>
+		</td>
+	</tr>
+</table>
+
+<p>This is the real meat of the page, 
+and represents the text we want. 
+It has lots of juicy content.
+
+We assume that it won't get filtered out.
+And that all of the lines will be in the
+output.
+</p>
+
+<p>
+Here's another paragraph of text.
+This is the end of the text.
+</p>
+
+<p><a href="Footer.html">footer</a></p>
+
+</body>
+</html>
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/cars.csv b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/cars.csv
new file mode 100644
index 0000000..8f1f9e1
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/cars.csv
@@ -0,0 +1,6 @@
+Age,Color,Extras,Type,Used
+2,blue,GPS,"Gas, with electric",""
+10,green,"Labeled ""Vintage, 1913""",,yes
+100,red,"Labeled ""Vintage 1913""",yes
+5,orange,none,"This is a
+multi, line text",no
\ No newline at end of file
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/cars.csv.gz b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/cars.csv.gz
new file mode 100644
index 0000000..ee2a951
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/cars.csv.gz differ
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/cars.ssv b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/cars.ssv
new file mode 100644
index 0000000..54dfd4f
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/cars.ssv
@@ -0,0 +1,6 @@
+Age Color Extras Type Used
+2 blue GPS "Gas, with electric" ""
+10 green "Labeled ""Vintage, 1913"""  yes
+100 red "Labeled ""Vintage 1913""" yes
+5 orange none "This is a
+multi, line text" no
\ No newline at end of file
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/cars.tar.gz b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/cars.tar.gz
new file mode 100644
index 0000000..24128a3
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/cars.tar.gz differ
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/cars.tsv b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/cars.tsv
new file mode 100644
index 0000000..d4e1b46
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/cars.tsv
@@ -0,0 +1,6 @@
+Age	Color	Extras	Type	Used
+2	blue	GPS	"Gas	 with electric"	""
+10	green	"Labeled ""Vintage	 1913"""		yes
+100	red	"Labeled ""Vintage 1913"""	yes
+5	orange	none	"This is a
+multi	 line text"	no
\ No newline at end of file
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/complex.mbox b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/complex.mbox
new file mode 100644
index 0000000..2aa4828
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/complex.mbox
@@ -0,0 +1,291 @@
+From core-user-return-14700-apmail-hadoop-core-user-archive=hadoop.apache.org@hadoop.apache.org Mon Jun 01 04:28:28 2009
+Return-Path: <core-user-return-14700-apmail-hadoop-core-user-archive=hadoop.apache.org@hadoop.apache.org>
+Delivered-To: apmail-hadoop-core-user-archive@www.apache.org
+Received: (qmail 19921 invoked from network); 1 Jun 2009 04:28:28 -0000
+Received: from hermes.apache.org (HELO mail.apache.org) (140.211.11.3)
+  by minotaur.apache.org with SMTP; 1 Jun 2009 04:28:28 -0000
+Received: (qmail 84995 invoked by uid 500); 1 Jun 2009 04:28:38 -0000
+Delivered-To: apmail-hadoop-core-user-archive@hadoop.apache.org
+Received: (qmail 84895 invoked by uid 500); 1 Jun 2009 04:28:38 -0000
+Mailing-List: contact core-user-help@hadoop.apache.org; run by ezmlm
+Precedence: bulk
+List-Help: <mailto:core-user-help@hadoop.apache.org>
+List-Unsubscribe: <mailto:core-user-unsubscribe@hadoop.apache.org>
+List-Post: <mailto:core-user@hadoop.apache.org>
+List-Id: <core-user.hadoop.apache.org>
+Reply-To: core-user@hadoop.apache.org
+Delivered-To: mailing list core-user@hadoop.apache.org
+Received: (qmail 84885 invoked by uid 99); 1 Jun 2009 04:28:38 -0000
+Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136)
+    by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 01 Jun 2009 04:28:38 +0000
+X-ASF-Spam-Status: No, hits=1.2 required=10.0
+	tests=SPF_NEUTRAL
+X-Spam-Check-By: apache.org
+Received-SPF: neutral (athena.apache.org: local policy)
+Received: from [69.147.107.21] (HELO mrout2-b.corp.re1.yahoo.com) (69.147.107.21)
+    by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 01 Jun 2009 04:28:26 +0000
+Received: from SNV-EXPF01.ds.corp.yahoo.com (snv-expf01.ds.corp.yahoo.com [207.126.227.250])
+	by mrout2-b.corp.re1.yahoo.com (8.13.8/8.13.8/y.out) with ESMTP id n514QYA6099963
+	for <core-user@hadoop.apache.org>; Sun, 31 May 2009 21:26:35 -0700 (PDT)
+DomainKey-Signature: a=rsa-sha1; s=serpent; d=yahoo-inc.com; c=nofws; q=dns;
+	h=received:user-agent:date:subject:from:to:message-id:
+	thread-topic:thread-index:in-reply-to:mime-version:content-type:
+	content-transfer-encoding:x-originalarrivaltime;
+	b=YVtSNdgjeeSBS1yY3XDolul49i+HrgNG7QszMo9LzGnrwejjgsl5+iUM6EiQgEpV
+Received: from SNV-EXVS08.ds.corp.yahoo.com ([207.126.227.9]) by SNV-EXPF01.ds.corp.yahoo.com with Microsoft SMTPSVC(6.0.3790.3959);
+	 Sun, 31 May 2009 21:26:34 -0700
+Received: from 10.66.92.213 ([10.66.92.213]) by SNV-EXVS08.ds.corp.yahoo.com ([207.126.227.58]) with Microsoft Exchange Server HTTP-DAV ;
+ Mon,  1 Jun 2009 04:26:33 +0000
+User-Agent: Microsoft-Entourage/12.17.0.090302
+Date: Mon, 01 Jun 2009 09:56:31 +0530
+Subject: Re: question about when shuffle/sort start working
+From: Jothi Padmanabhan <jothipn@yahoo-inc.com>
+To: <core-user@hadoop.apache.org>
+Message-ID: <C649564F.1435F%jothipn@yahoo-inc.com>
+Thread-Topic: question about when shuffle/sort start working
+Thread-Index: AcnicSNoBw19cMU8UEaXwAdZ1YYhuw==
+In-Reply-To: <440622.41041.qm@web111005.mail.gq1.yahoo.com>
+Mime-version: 1.0
+Content-type: text/plain;
+	charset="US-ASCII"
+Content-transfer-encoding: 7bit
+X-OriginalArrivalTime: 01 Jun 2009 04:26:34.0501 (UTC) FILETIME=[257EAB50:01C9E271]
+X-Virus-Checked: Checked by ClamAV on apache.org
+
+When a Mapper completes, MapCompletionEvents are generated. Reducers try to
+fetch map outputs for a given map only on the receipt of such events.
+
+Jothi
+
+
+On 5/30/09 10:00 AM, "Jianmin Woo" <jianmin_woo@yahoo.com> wrote:
+
+> Hi, 
+> I am being confused by the protocol between mapper and reducer. When mapper
+> emitting the (key,value) pair done, is there any signal the mapper send out to
+> hadoop framework in protocol to indicate that map is done and the shuffle/sort
+> can begin for reducer? If there is no this signal in protocol, when the
+> framework begin the shuffle/sort?
+> 
+> Thanks,
+> Jianmin
+> 
+> 
+> 
+>       
+
+
+From core-user-return-14701-apmail-hadoop-core-user-archive=hadoop.apache.org@hadoop.apache.org Mon Jun 01 05:31:14 2009
+Return-Path: <core-user-return-14701-apmail-hadoop-core-user-archive=hadoop.apache.org@hadoop.apache.org>
+Delivered-To: apmail-hadoop-core-user-archive@www.apache.org
+Received: (qmail 38243 invoked from network); 1 Jun 2009 05:31:14 -0000
+Received: from hermes.apache.org (HELO mail.apache.org) (140.211.11.3)
+  by minotaur.apache.org with SMTP; 1 Jun 2009 05:31:14 -0000
+Received: (qmail 15621 invoked by uid 500); 1 Jun 2009 05:31:24 -0000
+Delivered-To: apmail-hadoop-core-user-archive@hadoop.apache.org
+Received: (qmail 15557 invoked by uid 500); 1 Jun 2009 05:31:24 -0000
+Mailing-List: contact core-user-help@hadoop.apache.org; run by ezmlm
+Precedence: bulk
+List-Help: <mailto:core-user-help@hadoop.apache.org>
+List-Unsubscribe: <mailto:core-user-unsubscribe@hadoop.apache.org>
+List-Post: <mailto:core-user@hadoop.apache.org>
+List-Id: <core-user.hadoop.apache.org>
+Reply-To: core-user@hadoop.apache.org
+Delivered-To: mailing list core-user@hadoop.apache.org
+Received: (qmail 15547 invoked by uid 99); 1 Jun 2009 05:31:24 -0000
+Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230)
+    by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 01 Jun 2009 05:31:24 +0000
+X-ASF-Spam-Status: No, hits=2.2 required=10.0
+	tests=HTML_MESSAGE,SPF_PASS
+X-Spam-Check-By: apache.org
+Received-SPF: pass (nike.apache.org: local policy)
+Received: from [68.142.237.94] (HELO n9.bullet.re3.yahoo.com) (68.142.237.94)
+    by apache.org (qpsmtpd/0.29) with SMTP; Mon, 01 Jun 2009 05:31:11 +0000
+Received: from [68.142.237.88] by n9.bullet.re3.yahoo.com with NNFMP; 01 Jun 2009 05:30:50 -0000
+Received: from [67.195.9.82] by t4.bullet.re3.yahoo.com with NNFMP; 01 Jun 2009 05:30:49 -0000
+Received: from [67.195.9.99] by t2.bullet.mail.gq1.yahoo.com with NNFMP; 01 Jun 2009 05:30:49 -0000
+Received: from [127.0.0.1] by omp103.mail.gq1.yahoo.com with NNFMP; 01 Jun 2009 05:28:01 -0000
+X-Yahoo-Newman-Property: ymail-3
+X-Yahoo-Newman-Id: 796121.97519.bm@omp103.mail.gq1.yahoo.com
+Received: (qmail 35264 invoked by uid 60001); 1 Jun 2009 05:30:49 -0000
+DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s1024; t=1243834249; bh=R8qzdi/IbLyO8UwpnaujDpT9E+6bJ7nkmZN2803EmRk=; h=Message-ID:X-YMail-OSG:Received:X-Mailer:References:Date:From:Subject:To:In-Reply-To:MIME-Version:Content-Type; b=vq4c6RIDbkuLPYd8mirusIXf6DqTb/IeT55In7W00Y5Sxx1ZiXBb78yE9+TDfXJ0elsEZvqv4ocyvolGE0eGtyYeJA0mZikpRNu6pidxPNpCplOcLHBRz7YQ7iERwv3TagRlWy2Xd3oD9ZeV0A05P7WUOiNNX1PUUJD1IVdrEZo=
+DomainKey-Signature:a=rsa-sha1; q=dns; c=nofws;
+  s=s1024; d=yahoo.com;
+  h=Message-ID:X-YMail-OSG:Received:X-Mailer:References:Date:From:Subject:To:In-Reply-To:MIME-Version:Content-Type;
+  b=6HXZV98ON5vBwmE/xS8stVD0D2F4dkMY7a0suX5KVTb736JdR8G59mqBq/dWcpbFTLiCLtxi18LMb/dU1RKRGOEdn3l3j/jKXhBrhIgfg3qtNskPedXDKBvn7JGXiSkqpA/tUtPjvc0Uuk8/LaA01SQTz40Engg7nD8/EJdIAhA=;
+Message-ID: <592088.35091.qm@web111010.mail.gq1.yahoo.com>
+X-YMail-OSG: KzhhrJYVM1m.MCS6vRpRP2ZZO2PrfnbngosELDCIa91ZqvhJph4RdmzfUW0jw9W04RCSch1K730bPohwNpNBIk2QR_zt4_mfbhfq7YEPkSoz9LSXG90P9vIo5Fc8qyZN0U6vA9gtdyGQTpN5ahvillUH9nAF0TMWv2SvZJLjPlQ0Z0p8oK8ltBwGTgLrM8Jtdn9D29yoRyi3_EpVOfdD9OP.EK50Vr1XwSUYMbnpZ0WGHMwd.Yig7A6Elwadm3YVbfOdx2mfrG.jQsUAxQjRBNvbrOM57.FaE11kHTe9aoBWSeihNg--
+Received: from [216.145.54.7] by web111010.mail.gq1.yahoo.com via HTTP; Sun, 31 May 2009 22:30:49 PDT
+X-Mailer: YahooMailRC/1277.43 YahooMailWebService/0.7.289.10
+References: <C649564F.1435F%jothipn@yahoo-inc.com>
+Date: Sun, 31 May 2009 22:30:49 -0700 (PDT)
+From: Jianmin Woo <jianmin_woo@yahoo.com>
+Subject: Re: question about when shuffle/sort start working
+To: core-user@hadoop.apache.org
+In-Reply-To: <C649564F.1435F%jothipn@yahoo-inc.com>
+MIME-Version: 1.0
+Content-Type: multipart/alternative; boundary="0-1193839393-1243834249=:35091"
+X-Virus-Checked: Checked by ClamAV on apache.org
+
+--0-1193839393-1243834249=:35091
+Content-Type: text/plain; charset=us-ascii
+
+Thanks a lot for your explanation, Jothi. 
+
+So is this event generated by hadoop framework? Is there any API in mapper to fire this event? Actually, I am thinking to implement a mapper that will emit some <key, value> pairs, then fire this event to let the reducer works, the same mapper task then emit some other <key, value> pairs and repeat. Do you think is this logic feasible by current API?
+
+Thanks,
+Jianmin
+
+
+
+
+
+________________________________
+From: Jothi Padmanabhan <jothipn@yahoo-inc.com>
+To: core-user@hadoop.apache.org
+Sent: Monday, June 1, 2009 12:26:31 PM
+Subject: Re: question about when shuffle/sort start working
+
+When a Mapper completes, MapCompletionEvents are generated. Reducers try to
+fetch map outputs for a given map only on the receipt of such events.
+
+Jothi
+
+
+On 5/30/09 10:00 AM, "Jianmin Woo" <jianmin_woo@yahoo.com> wrote:
+
+> Hi, 
+> I am being confused by the protocol between mapper and reducer. When mapper
+> emitting the (key,value) pair done, is there any signal the mapper send out to
+> hadoop framework in protocol to indicate that map is done and the shuffle/sort
+> can begin for reducer? If there is no this signal in protocol, when the
+> framework begin the shuffle/sort?
+> 
+> Thanks,
+> Jianmin
+> 
+> 
+> 
+>      
+
+
+      
+--0-1193839393-1243834249=:35091--
+
+
+From core-user-return-14702-apmail-hadoop-core-user-archive=hadoop.apache.org@hadoop.apache.org Mon Jun 01 06:04:30 2009
+Return-Path: <core-user-return-14702-apmail-hadoop-core-user-archive=hadoop.apache.org@hadoop.apache.org>
+Delivered-To: apmail-hadoop-core-user-archive@www.apache.org
+Received: (qmail 53387 invoked from network); 1 Jun 2009 06:04:29 -0000
+Received: from hermes.apache.org (HELO mail.apache.org) (140.211.11.3)
+  by minotaur.apache.org with SMTP; 1 Jun 2009 06:04:29 -0000
+Received: (qmail 39066 invoked by uid 500); 1 Jun 2009 06:04:39 -0000
+Delivered-To: apmail-hadoop-core-user-archive@hadoop.apache.org
+Received: (qmail 38970 invoked by uid 500); 1 Jun 2009 06:04:39 -0000
+Mailing-List: contact core-user-help@hadoop.apache.org; run by ezmlm
+Precedence: bulk
+List-Help: <mailto:core-user-help@hadoop.apache.org>
+List-Unsubscribe: <mailto:core-user-unsubscribe@hadoop.apache.org>
+List-Post: <mailto:core-user@hadoop.apache.org>
+List-Id: <core-user.hadoop.apache.org>
+Reply-To: core-user@hadoop.apache.org
+Delivered-To: mailing list core-user@hadoop.apache.org
+Received: (qmail 38955 invoked by uid 99); 1 Jun 2009 06:04:39 -0000
+Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136)
+    by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 01 Jun 2009 06:04:39 +0000
+X-ASF-Spam-Status: No, hits=1.2 required=10.0
+	tests=SPF_NEUTRAL
+X-Spam-Check-By: apache.org
+Received-SPF: neutral (athena.apache.org: local policy)
+Received: from [216.145.54.172] (HELO mrout2.yahoo.com) (216.145.54.172)
+    by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 01 Jun 2009 06:04:28 +0000
+Received: from SNV-EXBH01.ds.corp.yahoo.com (snv-exbh01.ds.corp.yahoo.com [207.126.227.249])
+	by mrout2.yahoo.com (8.13.6/8.13.6/y.out) with ESMTP id n5163FGq038852
+	for <core-user@hadoop.apache.org>; Sun, 31 May 2009 23:03:15 -0700 (PDT)
+DomainKey-Signature: a=rsa-sha1; s=serpent; d=yahoo-inc.com; c=nofws; q=dns;
+	h=received:user-agent:date:subject:from:to:message-id:
+	thread-topic:thread-index:in-reply-to:mime-version:content-type:
+	content-transfer-encoding:x-originalarrivaltime;
+	b=rChE4SCnwtWaZpjhovkiXDKfDiVNdRRvsadSGG9S9bgvOexn/9/5JjEQx1pOR7Nb
+Received: from SNV-EXVS08.ds.corp.yahoo.com ([207.126.227.9]) by SNV-EXBH01.ds.corp.yahoo.com with Microsoft SMTPSVC(6.0.3790.3959);
+	 Sun, 31 May 2009 23:03:15 -0700
+Received: from 10.66.92.213 ([10.66.92.213]) by SNV-EXVS08.ds.corp.yahoo.com ([207.126.227.58]) with Microsoft Exchange Server HTTP-DAV ;
+ Mon,  1 Jun 2009 06:03:15 +0000
+User-Agent: Microsoft-Entourage/12.17.0.090302
+Date: Mon, 01 Jun 2009 11:33:13 +0530
+Subject: Re: question about when shuffle/sort start working
+From: Jothi Padmanabhan <jothipn@yahoo-inc.com>
+To: <core-user@hadoop.apache.org>
+Message-ID: <C6496CF9.1437C%jothipn@yahoo-inc.com>
+Thread-Topic: question about when shuffle/sort start working
+Thread-Index: AcnifqWrLG6N7GAk7kqy9QalVWfegQ==
+In-Reply-To: <592088.35091.qm@web111010.mail.gq1.yahoo.com>
+Mime-version: 1.0
+Content-type: text/plain;
+	charset="US-ASCII"
+Content-transfer-encoding: 7bit
+X-OriginalArrivalTime: 01 Jun 2009 06:03:15.0462 (UTC) FILETIME=[A7231260:01C9E27E]
+X-Virus-Checked: Checked by ClamAV on apache.org
+
+
+No, you cannot raise this event yourself; this event is generated internally
+by the framework.
+
+I am guessing that what you probably want is to have a chain of MapReduce
+Jobs where the output of one is automatically fed as input to another.  You
+can look at these classes: JobControl and ChainMapper/ChainReducer.
+
+Jothi
+
+On 6/1/09 11:00 AM, "Jianmin Woo" <jianmin_woo@yahoo.com> wrote:
+
+> Thanks a lot for your explanation, Jothi.
+> 
+> So is this event generated by hadoop framework? Is there any API in mapper to
+> fire this event? Actually, I am thinking to implement a mapper that will emit
+> some <key, value> pairs, then fire this event to let the reducer works, the
+> same mapper task then emit some other <key, value> pairs and repeat. Do you
+> think is this logic feasible by current API?
+> 
+> Thanks,
+> Jianmin
+> 
+> 
+> 
+> 
+> 
+> ________________________________
+> From: Jothi Padmanabhan <jothipn@yahoo-inc.com>
+> To: core-user@hadoop.apache.org
+> Sent: Monday, June 1, 2009 12:26:31 PM
+> Subject: Re: question about when shuffle/sort start working
+> 
+> When a Mapper completes, MapCompletionEvents are generated. Reducers try to
+> fetch map outputs for a given map only on the receipt of such events.
+> 
+> Jothi
+> 
+> 
+> On 5/30/09 10:00 AM, "Jianmin Woo" <jianmin_woo@yahoo.com> wrote:
+> 
+>> Hi, 
+>> I am being confused by the protocol between mapper and reducer. When mapper
+>> emitting the (key,value) pair done, is there any signal the mapper send out
+>> to
+>> hadoop framework in protocol to indicate that map is done and the
+>> shuffle/sort
+>> can begin for reducer? If there is no this signal in protocol, when the
+>> framework begin the shuffle/sort?
+>> 
+>> Thanks,
+>> Jianmin
+>> 
+>> 
+>> 
+>>      
+> 
+> 
+>       
+
+
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/multiline-sessions.log b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/multiline-sessions.log
new file mode 100644
index 0000000..633a15c
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/multiline-sessions.log
@@ -0,0 +1,9 @@
+Started GET /foo
+  Foo Started GET as HTML
+Completed 401 Unauthorized in 0ms
+
+
+Started GET /bar
+  Bar as HTML
+Completed 200 OK in 339ms
+Started GET /baz
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/multiline-stacktrace-expected-long-event.log b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/multiline-stacktrace-expected-long-event.log
new file mode 100644
index 0000000..419f799
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/multiline-stacktrace-expected-long-event.log
@@ -0,0 +1,25 @@
+Grave: Timer task com.base2services.jenkins.SqsQueueHandler@32eea79d failed
+com.amazonaws.AmazonClientException: Unable to calculate a request signature: Unable to calculate a request signature: Empty key
+  at com.amazonaws.auth.AbstractAWSSigner.signAndBase64Encode(AbstractAWSSigner.java:71)
+  at com.amazonaws.auth.AbstractAWSSigner.signAndBase64Encode(AbstractAWSSigner.java:55)
+  at com.amazonaws.auth.QueryStringSigner.sign(QueryStringSigner.java:83)
+  at com.amazonaws.auth.QueryStringSigner.sign(QueryStringSigner.java:46)
+  at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:238)
+  at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:170)
+  at com.amazonaws.services.sqs.AmazonSQSClient.invoke(AmazonSQSClient.java:776)
+  at com.amazonaws.services.sqs.AmazonSQSClient.listQueues(AmazonSQSClient.java:564)
+  at com.amazonaws.services.sqs.AmazonSQSClient.listQueues(AmazonSQSClient.java:732)
+  at com.base2services.jenkins.SqsProfile.createQueue(SqsProfile.java:72)
+  at com.base2services.jenkins.SqsProfile.getQueueUrl(SqsProfile.java:62)
+  at com.base2services.jenkins.SqsQueueHandler.doRun(SqsQueueHandler.java:37)
+  at hudson.triggers.SafeTimerTask.run(SafeTimerTask.java:54)
+  at java.util.TimerThread.mainLoop(Timer.java:555)
+  at java.util.TimerThread.run(Timer.java:505)
+Caused by: com.amazonaws.AmazonClientException: Unable to calculate a request signature: Empty key
+  at com.amazonaws.auth.AbstractAWSSigner.sign(AbstractAWSSigner.java:90)
+  at com.amazonaws.auth.AbstractAWSSigner.signAndBase64Encode(AbstractAWSSigner.java:68)
+  ... 14 more
+Caused by: java.lang.IllegalArgumentException: Empty key
+  at javax.crypto.spec.SecretKeySpec.<init>(SecretKeySpec.java:96)
+  at com.amazonaws.auth.AbstractAWSSigner.sign(AbstractAWSSigner.java:87)
+  ... 15 more
\ No newline at end of file
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/multiline-stacktrace.log b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/multiline-stacktrace.log
new file mode 100644
index 0000000..b860012
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/multiline-stacktrace.log
@@ -0,0 +1,30 @@
+juil. 25, 2012 10:49:46 AM hudson.triggers.SafeTimerTask run
+Grave: Timer task com.base2services.jenkins.SqsQueueHandler@32eea79d failed
+com.amazonaws.AmazonClientException: Unable to calculate a request signature: Unable to calculate a request signature: Empty key
+  at com.amazonaws.auth.AbstractAWSSigner.signAndBase64Encode(AbstractAWSSigner.java:71)
+  at com.amazonaws.auth.AbstractAWSSigner.signAndBase64Encode(AbstractAWSSigner.java:55)
+  at com.amazonaws.auth.QueryStringSigner.sign(QueryStringSigner.java:83)
+  at com.amazonaws.auth.QueryStringSigner.sign(QueryStringSigner.java:46)
+  at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:238)
+  at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:170)
+  at com.amazonaws.services.sqs.AmazonSQSClient.invoke(AmazonSQSClient.java:776)
+  at com.amazonaws.services.sqs.AmazonSQSClient.listQueues(AmazonSQSClient.java:564)
+  at com.amazonaws.services.sqs.AmazonSQSClient.listQueues(AmazonSQSClient.java:732)
+  at com.base2services.jenkins.SqsProfile.createQueue(SqsProfile.java:72)
+  at com.base2services.jenkins.SqsProfile.getQueueUrl(SqsProfile.java:62)
+  at com.base2services.jenkins.SqsQueueHandler.doRun(SqsQueueHandler.java:37)
+  at hudson.triggers.SafeTimerTask.run(SafeTimerTask.java:54)
+  at java.util.TimerThread.mainLoop(Timer.java:555)
+  at java.util.TimerThread.run(Timer.java:505)
+Caused by: com.amazonaws.AmazonClientException: Unable to calculate a request signature: Empty key
+  at com.amazonaws.auth.AbstractAWSSigner.sign(AbstractAWSSigner.java:90)
+  at com.amazonaws.auth.AbstractAWSSigner.signAndBase64Encode(AbstractAWSSigner.java:68)
+  ... 14 more
+Caused by: java.lang.IllegalArgumentException: Empty key
+  at javax.crypto.spec.SecretKeySpec.<init>(SecretKeySpec.java:96)
+  at com.amazonaws.auth.AbstractAWSSigner.sign(AbstractAWSSigner.java:87)
+  ... 15 more
+
+
+juil. 25, 2012 10:49:54 AM hudson.slaves.SlaveComputer tryReconnect
+Infos: Attempting to reconnect CentosVagrant
\ No newline at end of file
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/non-length-delimited-20130430-234145-tweets.json.gz b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/non-length-delimited-20130430-234145-tweets.json.gz
new file mode 100644
index 0000000..e5c0f7a
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/non-length-delimited-20130430-234145-tweets.json.gz differ
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/rsstest.rss b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/rsstest.rss
new file mode 100644
index 0000000..758f6a1
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/rsstest.rss
@@ -0,0 +1,36 @@
+<?xml version="1.0" encoding="ISO-8859-1" ?>
+<!--
+	Licensed to the Apache Software Foundation (ASF) under one or more
+	contributor license agreements.  See the NOTICE file distributed with
+	this work for additional information regarding copyright ownership.
+	The ASF licenses this file to You under the Apache License, Version 2.0
+	(the "License"); you may not use this file except in compliance with
+	the License.  You may obtain a copy of the License at
+	
+	http://www.apache.org/licenses/LICENSE-2.0
+	
+	Unless required by applicable law or agreed to in writing, software
+	distributed under the License is distributed on an "AS IS" BASIS,
+	WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+	See the License for the specific language governing permissions and
+	limitations under the License.
+-->
+<rss version="0.91">
+    <channel>
+      <title>TestChannel</title>
+      <link>http://test.channel.com/</link> 
+      <description>Sample RSS File for Junit test</description> 
+      <language>en-us</language>
+      
+      <item>
+        <title>Home Page of Chris Mattmann</title>
+        <link>http://www-scf.usc.edu/~mattmann/</link>
+        <description>Chris Mattmann's home page</description>
+      </item>
+      <item>
+        <title>Awesome Open Source Search Engine</title> 
+        <link>http://www.nutch.org/</link> 
+        <description>Yup, that's what it is</description> 
+      </item>
+   </channel>
+</rss>
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/sample-statuses-20120906-141433 b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/sample-statuses-20120906-141433
new file mode 100644
index 0000000..e633a1f
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/sample-statuses-20120906-141433
@@ -0,0 +1,4 @@
+1000
+{"text":"sample tweet one","retweet_count":0,"in_reply_to_user_id":null,"retweeted":false,"truncated":false,"source":"href=\"http:\/\/sample.com\"","id_str":"1234567891","entities":{"user_mentions":[],"hashtags":[],"urls":[]},"in_reply_to_status_id":null,"place":null,"in_reply_to_status_id_str":null,"coordinates":null,"created_at":"Wed Sep 05 01:01:01 +0000 1985","in_reply_to_screen_name":null,"favorited":false,"in_reply_to_user_id_str":null,"user":{"default_profile_image":false,"friends_count":111,"profile_background_color":"3C0C29","location":"Palo Alto","is_translator":false,"profile_background_tile":true,"favourites_count":11,"verified":false,"profile_sidebar_fill_color":"efefef","follow_request_sent":null,"contributors_enabled":false,"description":"desc1","profile_sidebar_border_color":"eeeeee","profile_image_url_https":"https:\/\/si0.twimg.com\/profile_images\/1\/normal.jpg","id_str":"1111111","listed_count":1,"lang":"en","screen_name":"fake_user1","show_all_inline_media":false,"profile_use_background_image":true,"profile_image_url":"http:\/\/a0.twimg.com\/profile_images\/1111111\/normal.jpg","default_profile":false,"statuses_count":11111,"created_at":"Thu Apr 07 11:04:54 +0000 1985","profile_text_color":"333333","followers_count":111,"protected":false,"following":null,"notifications":null,"profile_background_image_url":"http:\/\/a0.twimg.com\/images\/themes\/theme1\/bg.gif","time_zone":null,"url":null,"name":"name1","geo_enabled":false,"profile_link_color":"009999","id":1111112,"profile_background_image_url_https":"https:\/\/si0.twimg.com\/images\/themes\/theme1\/bg.gif","utc_offset":null},"id":11111112,"contributors":null,"geo":null}
+2000
+{"text":"sample tweet two","retweet_count":0,"in_reply_to_user_id":null,"retweeted":false,"truncated":false,"source":"href=\"http:\/\/sample.com\"","id_str":"2345678902","entities":{"user_mentions":[],"hashtags":[],"urls":[]},"in_reply_to_status_id":null,"place":null,"in_reply_to_status_id_str":null,"coordinates":null,"created_at":"Wed Sep 05 02:14:34 +0000 1985","in_reply_to_screen_name":null,"favorited":false,"in_reply_to_user_id_str":null,"user":{"default_profile_image":false,"friends_count":222,"profile_background_color":"3C0C29","location":"San Francisco","is_translator":false,"profile_background_tile":false,"favourites_count":22,"verified":false,"profile_sidebar_fill_color":"B2D948","follow_request_sent":null,"contributors_enabled":false,"description":"desc2","profile_sidebar_border_color":"8EC63D","profile_image_url_https":"https:\/\/si0.twimg.com\/profile_images\/22222222\/image_normal.jpg","id_str":"2222222","listed_count":0,"lang":"en","screen_name":"fake_user2","show_all_inline_media":false,"profile_use_background_image":true,"profile_image_url":"http:\/\/a0.twimg.com\/profile_images\/2222222\/image_normal.jpg","default_profile":false,"statuses_count":222222,"created_at":"Thu Aug 04 11:33:28 +0000 1985","profile_text_color":"444444","followers_count":222,"protected":false,"following":null,"notifications":null,"profile_background_image_url":"http:\/\/a0.twimg.com\/profile_background_images\/222222\/222222.jpg","time_zone":"Central Time (US & Canada)","url":null,"name":"name2","geo_enabled":false,"profile_link_color":"9A0057","id":2222223,"profile_background_image_url_https":"https:\/\/si0.twimg.com\/profile_background_images\/2222222\/22222.jpg","utc_offset":-21600},"id":222223,"contributors":null,"geo":null}
\ No newline at end of file
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/sample-statuses-20120906-141433-medium.avro b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/sample-statuses-20120906-141433-medium.avro
new file mode 100644
index 0000000..900507c
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/sample-statuses-20120906-141433-medium.avro differ
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/sample-statuses-20120906-141433-subschema.avsc b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/sample-statuses-20120906-141433-subschema.avsc
new file mode 100644
index 0000000..cd64717
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/sample-statuses-20120906-141433-subschema.avsc
@@ -0,0 +1,12 @@
+{
+  "type" : "record",
+  "name" : "Doc",
+  "doc" : "adoc",
+  "fields" : [ {
+    "name" : "id",
+    "type" : "string"
+  }, {
+    "name" : "text",
+    "type" : [ "string", "null" ]
+  } ]
+}
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/sample-statuses-20120906-141433.avro b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/sample-statuses-20120906-141433.avro
new file mode 100644
index 0000000..4dbf180
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/sample-statuses-20120906-141433.avro differ
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/sample-statuses-20120906-141433.avsc b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/sample-statuses-20120906-141433.avsc
new file mode 100644
index 0000000..9e4529f
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/sample-statuses-20120906-141433.avsc
@@ -0,0 +1,57 @@
+{
+  "type" : "record",
+  "name" : "Doc",
+  "doc" : "adoc",
+  "fields" : [ {
+    "name" : "id",
+    "type" : "string"
+  }, {
+    "name" : "user_friends_count",
+    "type" : [ "int", "null" ]
+  }, {
+    "name" : "user_location",
+    "type" : [ "string", "null" ]
+  }, {
+    "name" : "user_description",
+    "type" : [ "string", "null" ]
+  }, {
+    "name" : "user_statuses_count",
+    "type" : [ "int", "null" ]
+  }, {
+    "name" : "user_followers_count",
+    "type" : [ "int", "null" ]
+  }, {
+    "name" : "user_name",
+    "type" : [ "string", "null" ]
+  }, {
+    "name" : "user_screen_name",
+    "type" : [ "string", "null" ]
+  }, {
+    "name" : "created_at",
+    "type" : [ "string", "null" ]
+  }, {
+    "name" : "text",
+    "type" : [ "string", "null" ]
+  }, {
+    "name" : "retweet_count",
+    "type" : [ "int", "null" ]
+  }, {
+    "name" : "retweeted",
+    "type" : [ "boolean", "null" ]
+  }, {
+    "name" : "in_reply_to_user_id",
+    "type" : [ "long", "null" ]
+  }, {
+    "name" : "source",
+    "type" : [ "string", "null" ]
+  }, {
+    "name" : "in_reply_to_status_id",
+    "type" : [ "long", "null" ]
+  }, {
+    "name" : "media_url_https",
+    "type" : [ "string", "null" ]
+  }, {
+    "name" : "expanded_url",
+    "type" : [ "string", "null" ]
+  } ]
+}
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/sample-statuses-20120906-141433.bz2 b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/sample-statuses-20120906-141433.bz2
new file mode 100644
index 0000000..a4a9159
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/sample-statuses-20120906-141433.bz2 differ
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/sample-statuses-20120906-141433.gz b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/sample-statuses-20120906-141433.gz
new file mode 100644
index 0000000..3e7a44c
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/sample-statuses-20120906-141433.gz differ
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/test-documents.7z b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/test-documents.7z
new file mode 100644
index 0000000..94d62d3
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/test-documents.7z differ
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/test-documents.cpio b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/test-documents.cpio
new file mode 100644
index 0000000..c13a1cb
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/test-documents.cpio differ
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/test-documents.tar b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/test-documents.tar
new file mode 100644
index 0000000..3076a58
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/test-documents.tar differ
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/test-documents.tbz2 b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/test-documents.tbz2
new file mode 100644
index 0000000..21488d3
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/test-documents.tbz2 differ
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/test-documents.tgz b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/test-documents.tgz
new file mode 100644
index 0000000..baca6bb
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/test-documents.tgz differ
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/test-documents.zip b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/test-documents.zip
new file mode 100644
index 0000000..27d600d
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/test-documents.zip differ
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/test-outlook.msg b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/test-outlook.msg
new file mode 100644
index 0000000..c975c0c
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/test-outlook.msg differ
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/test-zip-of-zip.zip b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/test-zip-of-zip.zip
new file mode 100644
index 0000000..f6b3edc
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/test-zip-of-zip.zip differ
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testAIFF.aif b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testAIFF.aif
new file mode 100644
index 0000000..97eac1d
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testAIFF.aif differ
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testBMP.bmp b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testBMP.bmp
new file mode 100644
index 0000000..c017615
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testBMP.bmp differ
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testBMPfp.txt b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testBMPfp.txt
new file mode 100644
index 0000000..1da2966
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testBMPfp.txt
@@ -0,0 +1,3 @@
+BMW to Make Hybrid Sports Car 
+ By CHRISTOPH RAUWALD . 
+LEIPZIG, Germany—German car maker BMW AG said Friday it will start series production of a new plug-in hybrid sports car in 2013, to be based on the Vision EfficientDynamics Concept car shown at the Frankfurt auto show in September last year. Chief Executive Norbert Reithofer said the car will be produced in Germany but didn't provide details on the price. The BMW Vision EfficientDynamics Concept car is a sporty plug-in, full hybrid with a turbo-diesel engine, four seats and upward-pivoting doors. BMW executive board member Klaus Draeger told reporters he expects to achieve "a significant sales volume" with the new high-performance sports car. Asked whether annual sales could exceed 1,000 vehicles, Mr. Draeger said, "You said this and I'm not saying this is wrong." In March, Mr. Reithofer indicated that the concept car was set to make it into series production. "I like the car. And you know what it means when I say I like the car—it means I will drive it. It's not just a concept car," he told analysts during a presentation in Munich. The car will be designed for sale in all major global markets, which according to Mr. Draeger might require offering a gasoline engine instead of the prototype's three-cylinder diesel engine. Diesel cars account for roughly half of the European market, but are significantly less popular in the U.S. and hardly present at all in China. Mr. Draeger declined to comment on the vehicle's price tag, but noted that in order to achieve substantial sales volumes the price mustn't be too high. He said the same goes for BMW's planned Megacity Vehicle. A price tag of €60,000 ($85,242) or more would certainly limit potential sales volumes, he said. 
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testDITA.dita b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testDITA.dita
new file mode 100644
index 0000000..b68da7b
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testDITA.dita
@@ -0,0 +1,34 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE task PUBLIC "-//OASIS//DTD DITA Task//EN" "task.dtd">
+<task id="apache-tika">
+    <title>Apache Tika</title>
+    <shortdesc>Apache Tika - a content analysis toolkit.</shortdesc>
+    <prolog>
+        <author>Apache Software Foundation</author>
+        <copyright>
+            <copyryear year="2011"/>
+            <copyrholder>Apache Software Foundation</copyrholder>
+        </copyright>
+        <metadata>
+            <audience experiencelevel="expert" job="Customizing" type="Coder"/>
+            <category>Metadata</category>
+            <keywords>
+                <keyword>Tika</keyword>
+                <keyword>Content</keyword>
+            </keywords>
+            <prodinfo>
+                <prodname>Apache Tika</prodname>
+                <vrmlist>
+                    <vrm version="1.x" release="Final" modification="2011/11/11"/>
+                </vrmlist>
+            </prodinfo>
+        </metadata>
+    </prolog>
+    <taskbody>
+        <context>
+            <p>The Apache Tika toolkit detects and extracts metadata and structured text content from various documents using existing parser libraries. You can find the latest release on the download page. See the Getting Started guide for instructions on how to start using Tika.</p>
+
+            <p>Tika is a project of the Apache Software Foundation, and was formerly a subproject of Apache Lucene.</p>
+        </context>
+    </taskbody>
+</task>
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testEMLX.emlx b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testEMLX.emlx
new file mode 100644
index 0000000..d9a7126
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testEMLX.emlx
@@ -0,0 +1,55 @@
+1795
+From: "Julien Nioche (JIRA)" <jira@apache.org>
+To: dev@tika.apache.org
+Subject: [jira] Commented: (TIKA-461) RFC822 messages not parsed
+Reply-To: dev@tika.apache.org
+Delivered-To: mailing list dev@tika.apache.org
+Date: Mon, 6 Sep 2010 05:25:34 -0400 (EDT)
+In-Reply-To: <6089099.260231278600349994.JavaMail.jira@thor>
+MIME-Version: 1.0
+Content-Type: text/plain; charset=utf-8
+Content-Transfer-Encoding: 7bit
+X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394
+X-Virus-Checked: Checked by ClamAV on apache.org
+
+
+    [ https://issues.apache.org/jira/browse/TIKA-461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12906468#action_12906468 ] 
+
+Julien Nioche commented on TIKA-461:
+------------------------------------
+
+I'll have a look at mime4j and try to use it in Tika
+
+> RFC822 messages not parsed
+> --------------------------
+>
+>                 Key: TIKA-461
+>                 URL: https://issues.apache.org/jira/browse/TIKA-461
+>             Project: Tika
+>          Issue Type: Bug
+>          Components: parser
+>    Affects Versions: 0.7
+>            Reporter: Joshua Turner
+>            Assignee: Julien Nioche
+>
+> Presented with an RFC822 message exported from Thunderbird, AutodetectParser produces an empty body, and a Metadata containing only one key-value pair: "Content-Type=message/rfc822". Directly calling MboxParser likewise gives an empty body, but with two metadata pairs: "Content-Encoding=us-ascii Content-Type=application/mbox".
+> A quick peek at the source of MboxParser shows that the implementation is pretty naive. If the wiring can be sorted out, something like Apache James' mime4j might be a better bet.
+
+-- 
+This message is automatically generated by JIRA.
+-
+You can reply to this email to add a comment to the issue online.
+
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
+<plist version="1.0">
+<dict>
+	<key>flags</key>
+	<integer>0</integer>
+	<key>sender</key>
+	<string>"Julien Nioche (JIRA)" &lt;jira@apache.org&gt;</string>
+	<key>subject</key>
+	<string>[jira] Commented: (TIKA-461) RFC822 messages not parsed</string>
+	<key>to</key>
+	<string>dev@tika.apache.org</string></dict>
+</plist>
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testEXCEL.xls b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testEXCEL.xls
new file mode 100644
index 0000000..86b2916
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testEXCEL.xls differ
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testEXCEL.xlsx b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testEXCEL.xlsx
new file mode 100644
index 0000000..8d5169f
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testEXCEL.xlsx differ
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testFLAC.flac b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testFLAC.flac
new file mode 100644
index 0000000..ccec947
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testFLAC.flac differ
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testFLAC.oga b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testFLAC.oga
new file mode 100644
index 0000000..37a1247
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testFLAC.oga differ
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testFLV.flv b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testFLV.flv
new file mode 100644
index 0000000..d35e9bb
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testFLV.flv differ
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testGIF.gif b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testGIF.gif
new file mode 100644
index 0000000..e09e641
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testGIF.gif differ
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testJAR.jar b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testJAR.jar
new file mode 100644
index 0000000..4677a62
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testJAR.jar differ
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testJPEG_EXIF.jpg b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testJPEG_EXIF.jpg
new file mode 100644
index 0000000..1b93e77
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testJPEG_EXIF.jpg differ
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testJPEG_EXIF.jpg.gz b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testJPEG_EXIF.jpg.gz
new file mode 100644
index 0000000..2ee8e9c
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testJPEG_EXIF.jpg.gz differ
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testJPEG_EXIF.jpg.tar.gz b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testJPEG_EXIF.jpg.tar.gz
new file mode 100644
index 0000000..3f35102
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testJPEG_EXIF.jpg.tar.gz differ
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testKML.kml b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testKML.kml
new file mode 100644
index 0000000..5f17f62
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testKML.kml
@@ -0,0 +1,917 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<kml xmlns="http://www.opengis.net/kml/2.2">
+  <!-- Provided by Google as part of their KML Documentation -->
+  <!-- Available at https://developers.google.com/kml/documentation/KML_Samples.kml -->
+  <Document>
+    <name>KML Samples</name>
+    <open>1</open>
+    <description>Unleash your creativity with the help of these examples!</description>
+    <Style id="downArrowIcon">
+      <IconStyle>
+        <Icon>
+          <href>http://maps.google.com/mapfiles/kml/pal4/icon28.png</href>
+        </Icon>
+      </IconStyle>
+    </Style>
+    <Style id="globeIcon">
+      <IconStyle>
+        <Icon>
+          <href>http://maps.google.com/mapfiles/kml/pal3/icon19.png</href>
+        </Icon>
+      </IconStyle>
+      <LineStyle>
+        <width>2</width>
+      </LineStyle>
+    </Style>
+    <Style id="transPurpleLineGreenPoly">
+      <LineStyle>
+        <color>7fff00ff</color>
+        <width>4</width>
+      </LineStyle>
+      <PolyStyle>
+        <color>7f00ff00</color>
+      </PolyStyle>
+    </Style>
+    <Style id="yellowLineGreenPoly">
+      <LineStyle>
+        <color>7f00ffff</color>
+        <width>4</width>
+      </LineStyle>
+      <PolyStyle>
+        <color>7f00ff00</color>
+      </PolyStyle>
+    </Style>
+    <Style id="thickBlackLine">
+      <LineStyle>
+        <color>87000000</color>
+        <width>10</width>
+      </LineStyle>
+    </Style>
+    <Style id="redLineBluePoly">
+      <LineStyle>
+        <color>ff0000ff</color>
+      </LineStyle>
+      <PolyStyle>
+        <color>ffff0000</color>
+      </PolyStyle>
+    </Style>
+    <Style id="blueLineRedPoly">
+      <LineStyle>
+        <color>ffff0000</color>
+      </LineStyle>
+      <PolyStyle>
+        <color>ff0000ff</color>
+      </PolyStyle>
+    </Style>
+    <Style id="transRedPoly">
+      <LineStyle>
+        <width>1.5</width>
+      </LineStyle>
+      <PolyStyle>
+        <color>7d0000ff</color>
+      </PolyStyle>
+    </Style>
+    <Style id="transBluePoly">
+      <LineStyle>
+        <width>1.5</width>
+      </LineStyle>
+      <PolyStyle>
+        <color>7dff0000</color>
+      </PolyStyle>
+    </Style>
+    <Style id="transGreenPoly">
+      <LineStyle>
+        <width>1.5</width>
+      </LineStyle>
+      <PolyStyle>
+        <color>7d00ff00</color>
+      </PolyStyle>
+    </Style>
+    <Style id="transYellowPoly">
+      <LineStyle>
+        <width>1.5</width>
+      </LineStyle>
+      <PolyStyle>
+        <color>7d00ffff</color>
+      </PolyStyle>
+    </Style>
+    <Style id="noDrivingDirections">
+      <BalloonStyle>
+        <text><![CDATA[
+          <b>$[name]</b>
+          <br /><br />
+          $[description]
+        ]]></text>
+      </BalloonStyle>
+    </Style>
+    <Folder>
+      <name>Placemarks</name>
+      <description>These are just some of the different kinds of placemarks with
+        which you can mark your favorite places</description>
+      <LookAt>
+        <longitude>-122.0839597145766</longitude>
+        <latitude>37.42222904525232</latitude>
+        <altitude>0</altitude>
+        <heading>-148.4122922628044</heading>
+        <tilt>40.5575073395506</tilt>
+        <range>500.6566641072245</range>
+      </LookAt>
+      <Placemark>
+        <name>Simple placemark</name>
+        <description>Attached to the ground. Intelligently places itself at the
+          height of the underlying terrain.</description>
+        <Point>
+          <coordinates>-122.0822035425683,37.42228990140251,0</coordinates>
+        </Point>
+      </Placemark>
+      <Placemark>
+        <name>Floating placemark</name>
+        <visibility>0</visibility>
+        <description>Floats a defined distance above the ground.</description>
+        <LookAt>
+          <longitude>-122.0839597145766</longitude>
+          <latitude>37.42222904525232</latitude>
+          <altitude>0</altitude>
+          <heading>-148.4122922628044</heading>
+          <tilt>40.5575073395506</tilt>
+          <range>500.6566641072245</range>
+        </LookAt>
+        <styleUrl>#downArrowIcon</styleUrl>
+        <Point>
+          <altitudeMode>relativeToGround</altitudeMode>
+          <coordinates>-122.084075,37.4220033612141,50</coordinates>
+        </Point>
+      </Placemark>
+      <Placemark>
+        <name>Extruded placemark</name>
+        <visibility>0</visibility>
+        <description>Tethered to the ground by a customizable
+          &quot;tail&quot;</description>
+        <LookAt>
+          <longitude>-122.0845787421525</longitude>
+          <latitude>37.42215078737763</latitude>
+          <altitude>0</altitude>
+          <heading>-148.4126684946234</heading>
+          <tilt>40.55750733918048</tilt>
+          <range>365.2646606980322</range>
+        </LookAt>
+        <styleUrl>#globeIcon</styleUrl>
+        <Point>
+          <extrude>1</extrude>
+          <altitudeMode>relativeToGround</altitudeMode>
+          <coordinates>-122.0857667006183,37.42156927867553,50</coordinates>
+        </Point>
+      </Placemark>
+    </Folder>
+    <Folder>
+      <name>Styles and Markup</name>
+      <visibility>0</visibility>
+      <description>With KML it is easy to create rich, descriptive markup to
+        annotate and enrich your placemarks</description>
+      <LookAt>
+        <longitude>-122.0845787422371</longitude>
+        <latitude>37.42215078726837</latitude>
+        <altitude>0</altitude>
+        <heading>-148.4126777488172</heading>
+        <tilt>40.55750733930874</tilt>
+        <range>365.2646826292919</range>
+      </LookAt>
+      <styleUrl>#noDrivingDirections</styleUrl>
+      <Document>
+        <name>Highlighted Icon</name>
+        <visibility>0</visibility>
+        <description>Place your mouse over the icon to see it display the new
+          icon</description>
+        <LookAt>
+          <longitude>-122.0856552124024</longitude>
+          <latitude>37.4224281311035</latitude>
+          <altitude>0</altitude>
+          <heading>0</heading>
+          <tilt>0</tilt>
+          <range>265.8520424250024</range>
+        </LookAt>
+        <Style id="highlightPlacemark">
+          <IconStyle>
+            <Icon>
+              <href>http://maps.google.com/mapfiles/kml/paddle/red-stars.png</href>
+            </Icon>
+          </IconStyle>
+        </Style>
+        <Style id="normalPlacemark">
+          <IconStyle>
+            <Icon>
+              <href>http://maps.google.com/mapfiles/kml/paddle/wht-blank.png</href>
+            </Icon>
+          </IconStyle>
+        </Style>
+        <StyleMap id="exampleStyleMap">
+          <Pair>
+            <key>normal</key>
+            <styleUrl>#normalPlacemark</styleUrl>
+          </Pair>
+          <Pair>
+            <key>highlight</key>
+            <styleUrl>#highlightPlacemark</styleUrl>
+          </Pair>
+        </StyleMap>
+        <Placemark>
+          <name>Roll over this icon</name>
+          <visibility>0</visibility>
+          <styleUrl>#exampleStyleMap</styleUrl>
+          <Point>
+            <coordinates>-122.0856545755255,37.42243077405461,0</coordinates>
+          </Point>
+        </Placemark>
+      </Document>
+      <Placemark>
+        <name>Descriptive HTML</name>
+        <visibility>0</visibility>
+        <description><![CDATA[Click on the blue link!<br><br>
+Placemark descriptions can be enriched by using many standard HTML tags.<br>
+For example:
+<hr>
+Styles:<br>
+<i>Italics</i>, 
+<b>Bold</b>, 
+<u>Underlined</u>, 
+<s>Strike Out</s>, 
+subscript<sub>subscript</sub>, 
+superscript<sup>superscript</sup>, 
+<big>Big</big>, 
+<small>Small</small>, 
+<tt>Typewriter</tt>, 
+<em>Emphasized</em>, 
+<strong>Strong</strong>, 
+<code>Code</code>
+<hr>
+Fonts:<br> 
+<font color="red">red by name</font>, 
+<font color="#408010">leaf green by hexadecimal RGB</font>
+<br>
+<font size=1>size 1</font>, 
+<font size=2>size 2</font>, 
+<font size=3>size 3</font>, 
+<font size=4>size 4</font>, 
+<font size=5>size 5</font>, 
+<font size=6>size 6</font>, 
+<font size=7>size 7</font>
+<br>
+<font face=times>Times</font>, 
+<font face=verdana>Verdana</font>, 
+<font face=arial>Arial</font><br>
+<hr>
+Links: 
+<br>
+<a href="http://earth.google.com/">Google Earth!</a>
+<br>
+ or:  Check out our website at www.google.com
+<hr>
+Alignment:<br>
+<p align=left>left</p>
+<p align=center>center</p>
+<p align=right>right</p>
+<hr>
+Ordered Lists:<br>
+<ol><li>First</li><li>Second</li><li>Third</li></ol>
+<ol type="a"><li>First</li><li>Second</li><li>Third</li></ol>
+<ol type="A"><li>First</li><li>Second</li><li>Third</li></ol>
+<hr>
+Unordered Lists:<br>
+<ul><li>A</li><li>B</li><li>C</li></ul>
+<ul type="circle"><li>A</li><li>B</li><li>C</li></ul>
+<ul type="square"><li>A</li><li>B</li><li>C</li></ul>
+<hr>
+Definitions:<br>
+<dl>
+<dt>Google:</dt><dd>The best thing since sliced bread</dd>
+</dl>
+<hr>
+Centered:<br><center>
+Time present and time past<br>
+Are both perhaps present in time future,<br>
+And time future contained in time past.<br>
+If all time is eternally present<br>
+All time is unredeemable.<br>
+</center>
+<hr>
+Block Quote:
+<br>
+<blockquote>
+We shall not cease from exploration<br>
+And the end of all our exploring<br>
+Will be to arrive where we started<br>
+And know the place for the first time.<br>
+<i>-- T.S. Eliot</i>
+</blockquote>
+<br>
+<hr>
+Headings:<br>
+<h1>Header 1</h1>
+<h2>Header 2</h2>
+<h3>Header 3</h3>
+<h4>Header 4</h4>
+<h5>Header 5</h5>
+<hr>
+Images:<br>
+<i>Remote image</i><br>
+<img src="//developers.google.com/kml/documentation/images/googleSample.png"><br>
+<i>Scaled image</i><br>
+<img src="//developers.google.com/kml/documentation/images/googleSample.png" width=100><br>
+<hr>
+Simple Tables:<br>
+<table border="1" padding="1">
+<tr><td>1</td><td>2</td><td>3</td><td>4</td><td>5</td></tr>
+<tr><td>a</td><td>b</td><td>c</td><td>d</td><td>e</td></tr>
+</table>
+<br>
+[Did you notice that double-clicking on the placemark doesn't cause the viewer to take you anywhere? This is because it is possible to directly author a "placeless placemark". If you look at the code for this example, you will see that it has neither a point coordinate nor a LookAt element.]]]></description>
+      </Placemark>
+    </Folder>
+    <Folder>
+      <name>Ground Overlays</name>
+      <visibility>0</visibility>
+      <description>Examples of ground overlays</description>
+      <GroundOverlay>
+        <name>Large-scale overlay on terrain</name>
+        <visibility>0</visibility>
+        <description>Overlay shows Mount Etna erupting on July 13th, 2001.</description>
+        <LookAt>
+          <longitude>15.02468937557116</longitude>
+          <latitude>37.67395167941667</latitude>
+          <altitude>0</altitude>
+          <heading>-16.5581842842829</heading>
+          <tilt>58.31228652890705</tilt>
+          <range>30350.36838438907</range>
+        </LookAt>
+        <Icon>
+          <href>http://developers.google.com/kml/documentation/images/etna.jpg</href>
+        </Icon>
+        <LatLonBox>
+          <north>37.91904192681665</north>
+          <south>37.46543388598137</south>
+          <east>15.35832653742206</east>
+          <west>14.60128369746704</west>
+          <rotation>-0.1556640799496235</rotation>
+        </LatLonBox>
+      </GroundOverlay>
+    </Folder>
+    <Folder>
+      <name>Screen Overlays</name>
+      <visibility>0</visibility>
+      <description>Screen overlays have to be authored directly in KML. These
+        examples illustrate absolute and dynamic positioning in screen space.</description>
+      <ScreenOverlay>
+        <name>Simple crosshairs</name>
+        <visibility>0</visibility>
+        <description>This screen overlay uses fractional positioning to put the
+          image in the exact center of the screen</description>
+        <Icon>
+          <href>http://developers.google.com/kml/documentation/images/crosshairs.png</href>
+        </Icon>
+        <overlayXY x="0.5" y="0.5" xunits="fraction" yunits="fraction"/>
+        <screenXY x="0.5" y="0.5" xunits="fraction" yunits="fraction"/>
+        <rotationXY x="0.5" y="0.5" xunits="fraction" yunits="fraction"/>
+        <size x="0" y="0" xunits="pixels" yunits="pixels"/>
+      </ScreenOverlay>
+      <ScreenOverlay>
+        <name>Absolute Positioning: Top left</name>
+        <visibility>0</visibility>
+        <Icon>
+          <href>http://developers.google.com/kml/documentation/images/top_left.jpg</href>
+        </Icon>
+        <overlayXY x="0" y="1" xunits="fraction" yunits="fraction"/>
+        <screenXY x="0" y="1" xunits="fraction" yunits="fraction"/>
+        <rotationXY x="0" y="0" xunits="fraction" yunits="fraction"/>
+        <size x="0" y="0" xunits="fraction" yunits="fraction"/>
+      </ScreenOverlay>
+      <ScreenOverlay>
+        <name>Absolute Positioning: Top right</name>
+        <visibility>0</visibility>
+        <Icon>
+          <href>http://developers.google.com/kml/documentation/images/top_right.jpg</href>
+        </Icon>
+        <overlayXY x="1" y="1" xunits="fraction" yunits="fraction"/>
+        <screenXY x="1" y="1" xunits="fraction" yunits="fraction"/>
+        <rotationXY x="0" y="0" xunits="fraction" yunits="fraction"/>
+        <size x="0" y="0" xunits="fraction" yunits="fraction"/>
+      </ScreenOverlay>
+      <ScreenOverlay>
+        <name>Absolute Positioning: Bottom left</name>
+        <visibility>0</visibility>
+        <Icon>
+          <href>http://developers.google.com/kml/documentation/images/bottom_left.jpg</href>
+        </Icon>
+        <overlayXY x="0" y="-1" xunits="fraction" yunits="fraction"/>
+        <screenXY x="0" y="0" xunits="fraction" yunits="fraction"/>
+        <rotationXY x="0" y="0" xunits="fraction" yunits="fraction"/>
+        <size x="0" y="0" xunits="fraction" yunits="fraction"/>
+      </ScreenOverlay>
+      <ScreenOverlay>
+        <name>Absolute Positioning: Bottom right</name>
+        <visibility>0</visibility>
+        <Icon>
+          <href>http://developers.google.com/kml/documentation/images/bottom_right.jpg</href>
+        </Icon>
+        <overlayXY x="1" y="-1" xunits="fraction" yunits="fraction"/>
+        <screenXY x="1" y="0" xunits="fraction" yunits="fraction"/>
+        <rotationXY x="0" y="0" xunits="fraction" yunits="fraction"/>
+        <size x="0" y="0" xunits="fraction" yunits="fraction"/>
+      </ScreenOverlay>
+      <ScreenOverlay>
+        <name>Dynamic Positioning: Top of screen</name>
+        <visibility>0</visibility>
+        <Icon>
+          <href>http://developers.google.com/kml/documentation/images/dynamic_screenoverlay.jpg</href>
+        </Icon>
+        <overlayXY x="0" y="1" xunits="fraction" yunits="fraction"/>
+        <screenXY x="0" y="1" xunits="fraction" yunits="fraction"/>
+        <rotationXY x="0" y="0" xunits="fraction" yunits="fraction"/>
+        <size x="1" y="0.2" xunits="fraction" yunits="fraction"/>
+      </ScreenOverlay>
+      <ScreenOverlay>
+        <name>Dynamic Positioning: Right of screen</name>
+        <visibility>0</visibility>
+        <Icon>
+          <href>http://developers.google.com/kml/documentation/images/dynamic_right.jpg</href>
+        </Icon>
+        <overlayXY x="1" y="1" xunits="fraction" yunits="fraction"/>
+        <screenXY x="1" y="1" xunits="fraction" yunits="fraction"/>
+        <rotationXY x="0" y="0" xunits="fraction" yunits="fraction"/>
+        <size x="0" y="1" xunits="fraction" yunits="fraction"/>
+      </ScreenOverlay>
+    </Folder>
+    <Folder>
+      <name>Paths</name>
+      <visibility>0</visibility>
+      <description>Examples of paths. Note that the tessellate tag is by default
+        set to 0. If you want to create tessellated lines, they must be authored
+        (or edited) directly in KML.</description>
+      <Placemark>
+        <name>Tessellated</name>
+        <visibility>0</visibility>
+        <description><![CDATA[If the <tessellate> tag has a value of 1, the line will contour to the underlying terrain]]></description>
+        <LookAt>
+          <longitude>-112.0822680013139</longitude>
+          <latitude>36.09825589333556</latitude>
+          <altitude>0</altitude>
+          <heading>103.8120432044965</heading>
+          <tilt>62.04855796276328</tilt>
+          <range>2889.145007690472</range>
+        </LookAt>
+        <LineString>
+          <tessellate>1</tessellate>
+          <coordinates> -112.0814237830345,36.10677870477137,0
+            -112.0870267752693,36.0905099328766,0 </coordinates>
+        </LineString>
+      </Placemark>
+      <Placemark>
+        <name>Untessellated</name>
+        <visibility>0</visibility>
+        <description><![CDATA[If the <tessellate> tag has a value of 0, the line follows a simple straight-line path from point to point]]></description>
+        <LookAt>
+          <longitude>-112.0822680013139</longitude>
+          <latitude>36.09825589333556</latitude>
+          <altitude>0</altitude>
+          <heading>103.8120432044965</heading>
+          <tilt>62.04855796276328</tilt>
+          <range>2889.145007690472</range>
+        </LookAt>
+        <LineString>
+          <tessellate>0</tessellate>
+          <coordinates> -112.080622229595,36.10673460007995,0
+            -112.085242575315,36.09049598612422,0 </coordinates>
+        </LineString>
+      </Placemark>
+      <Placemark>
+        <name>Absolute</name>
+        <visibility>0</visibility>
+        <description>Transparent purple line</description>
+        <LookAt>
+          <longitude>-112.2719329043177</longitude>
+          <latitude>36.08890633450894</latitude>
+          <altitude>0</altitude>
+          <heading>-106.8161545998597</heading>
+          <tilt>44.60763714063257</tilt>
+          <range>2569.386744398339</range>
+        </LookAt>
+        <styleUrl>#transPurpleLineGreenPoly</styleUrl>
+        <LineString>
+          <tessellate>1</tessellate>
+          <altitudeMode>absolute</altitudeMode>
+          <coordinates> -112.265654928602,36.09447672602546,2357
+            -112.2660384528238,36.09342608838671,2357
+            -112.2668139013453,36.09251058776881,2357
+            -112.2677826834445,36.09189827357996,2357
+            -112.2688557510952,36.0913137941187,2357
+            -112.2694810717219,36.0903677207521,2357
+            -112.2695268555611,36.08932171487285,2357
+            -112.2690144567276,36.08850916060472,2357
+            -112.2681528815339,36.08753813597956,2357
+            -112.2670588176031,36.08682685262568,2357
+            -112.2657374587321,36.08646312301303,2357 </coordinates>
+        </LineString>
+      </Placemark>
+      <Placemark>
+        <name>Absolute Extruded</name>
+        <visibility>0</visibility>
+        <description>Transparent green wall with yellow outlines</description>
+        <LookAt>
+          <longitude>-112.2643334742529</longitude>
+          <latitude>36.08563154742419</latitude>
+          <altitude>0</altitude>
+          <heading>-125.7518698668815</heading>
+          <tilt>44.61038665812578</tilt>
+          <range>4451.842204068102</range>
+        </LookAt>
+        <styleUrl>#yellowLineGreenPoly</styleUrl>
+        <LineString>
+          <extrude>1</extrude>
+          <tessellate>1</tessellate>
+          <altitudeMode>absolute</altitudeMode>
+          <coordinates> -112.2550785337791,36.07954952145647,2357
+            -112.2549277039738,36.08117083492122,2357
+            -112.2552505069063,36.08260761307279,2357
+            -112.2564540158376,36.08395660588506,2357
+            -112.2580238976449,36.08511401044813,2357
+            -112.2595218489022,36.08584355239394,2357
+            -112.2608216347552,36.08612634548589,2357
+            -112.262073428656,36.08626019085147,2357
+            -112.2633204928495,36.08621519860091,2357
+            -112.2644963846444,36.08627897945274,2357
+            -112.2656969554589,36.08649599090644,2357 </coordinates>
+        </LineString>
+      </Placemark>
+      <Placemark>
+        <name>Relative</name>
+        <visibility>0</visibility>
+        <description>Black line (10 pixels wide), height tracks terrain</description>
+        <LookAt>
+          <longitude>-112.2580438551384</longitude>
+          <latitude>36.1072674824385</latitude>
+          <altitude>0</altitude>
+          <heading>4.947421249553717</heading>
+          <tilt>44.61324882043339</tilt>
+          <range>2927.61105910266</range>
+        </LookAt>
+        <styleUrl>#thickBlackLine</styleUrl>
+        <LineString>
+          <tessellate>1</tessellate>
+          <altitudeMode>relativeToGround</altitudeMode>
+          <coordinates> -112.2532845153347,36.09886943729116,645
+            -112.2540466121145,36.09919570465255,645
+            -112.254734666947,36.09984998366178,645
+            -112.255493345654,36.10051310621746,645
+            -112.2563157098468,36.10108441943419,645
+            -112.2568033076439,36.10159722088088,645
+            -112.257494011321,36.10204323542867,645
+            -112.2584106072308,36.10229131995655,645
+            -112.2596588987972,36.10240001286358,645
+            -112.2610581199487,36.10213176873407,645
+            -112.2626285262793,36.10157011437219,645 </coordinates>
+        </LineString>
+      </Placemark>
+      <Placemark>
+        <name>Relative Extruded</name>
+        <visibility>0</visibility>
+        <description>Opaque blue walls with red outline, height tracks terrain</description>
+        <LookAt>
+          <longitude>-112.2683594333433</longitude>
+          <latitude>36.09884362144909</latitude>
+          <altitude>0</altitude>
+          <heading>-72.24271551768405</heading>
+          <tilt>44.60855445139561</tilt>
+          <range>2184.193522571467</range>
+        </LookAt>
+        <styleUrl>#redLineBluePoly</styleUrl>
+        <LineString>
+          <extrude>1</extrude>
+          <tessellate>1</tessellate>
+          <altitudeMode>relativeToGround</altitudeMode>
+          <coordinates> -112.2656634181359,36.09445214722695,630
+            -112.2652238941097,36.09520916122063,630
+            -112.2645079986395,36.09580763864907,630
+            -112.2638827428817,36.09628572284063,630
+            -112.2635746835406,36.09679275951239,630
+            -112.2635711822407,36.09740038871899,630
+            -112.2640296531825,36.09804913435539,630
+            -112.264327720538,36.09880337400301,630
+            -112.2642436562271,36.09963644790288,630
+            -112.2639148687042,36.10055381117246,630
+            -112.2626894973474,36.10149062823369,630 </coordinates>
+        </LineString>
+      </Placemark>
+    </Folder>
+    <Folder>
+      <name>Polygons</name>
+      <visibility>0</visibility>
+      <description>Examples of polygon shapes</description>
+      <Folder>
+        <name>Google Campus</name>
+        <visibility>0</visibility>
+        <description>A collection showing how easy it is to create 3-dimensional
+          buildings</description>
+        <LookAt>
+          <longitude>-122.084120030116</longitude>
+          <latitude>37.42174011925477</latitude>
+          <altitude>0</altitude>
+          <heading>-34.82469740081282</heading>
+          <tilt>53.454348562403</tilt>
+          <range>276.7870053764046</range>
+        </LookAt>
+        <Placemark>
+          <name>Building 40</name>
+          <visibility>0</visibility>
+          <styleUrl>#transRedPoly</styleUrl>
+          <Polygon>
+            <extrude>1</extrude>
+            <altitudeMode>relativeToGround</altitudeMode>
+            <outerBoundaryIs>
+              <LinearRing>
+                <coordinates> -122.0848938459612,37.42257124044786,17
+                  -122.0849580979198,37.42211922626856,17
+                  -122.0847469573047,37.42207183952619,17
+                  -122.0845725380962,37.42209006729676,17
+                  -122.0845954886723,37.42215932700895,17
+                  -122.0838521118269,37.42227278564371,17
+                  -122.083792243335,37.42203539112084,17
+                  -122.0835076656616,37.42209006957106,17
+                  -122.0834709464152,37.42200987395161,17
+                  -122.0831221085748,37.4221046494946,17
+                  -122.0829247374572,37.42226503990386,17
+                  -122.0829339169385,37.42231242843094,17
+                  -122.0833837359737,37.42225046087618,17
+                  -122.0833607854248,37.42234159228745,17
+                  -122.0834204551642,37.42237075460644,17
+                  -122.083659133885,37.42251292011001,17
+                  -122.0839758438952,37.42265873093781,17
+                  -122.0842374743331,37.42265143972521,17
+                  -122.0845036949503,37.4226514386435,17
+                  -122.0848020460801,37.42261133916315,17
+                  -122.0847882750515,37.42256395055121,17
+                  -122.0848938459612,37.42257124044786,17 </coordinates>
+              </LinearRing>
+            </outerBoundaryIs>
+          </Polygon>
+        </Placemark>
+        <Placemark>
+          <name>Building 41</name>
+          <visibility>0</visibility>
+          <styleUrl>#transBluePoly</styleUrl>
+          <Polygon>
+            <extrude>1</extrude>
+            <altitudeMode>relativeToGround</altitudeMode>
+            <outerBoundaryIs>
+              <LinearRing>
+                <coordinates> -122.0857412771483,37.42227033155257,17
+                  -122.0858169768481,37.42231408832346,17
+                  -122.085852582875,37.42230337469744,17
+                  -122.0858799945639,37.42225686138789,17
+                  -122.0858860101409,37.4222311076138,17
+                  -122.0858069157288,37.42220250173855,17
+                  -122.0858379542653,37.42214027058678,17
+                  -122.0856732640519,37.42208690214408,17
+                  -122.0856022926407,37.42214885429042,17
+                  -122.0855902778436,37.422128290487,17
+                  -122.0855841672237,37.42208171967246,17
+                  -122.0854852065741,37.42210455874995,17
+                  -122.0855067264352,37.42214267949824,17
+                  -122.0854430712915,37.42212783846172,17
+                  -122.0850990714904,37.42251282407603,17
+                  -122.0856769818632,37.42281815323651,17
+                  -122.0860162273783,37.42244918858722,17
+                  -122.0857260327004,37.42229239604253,17
+                  -122.0857412771483,37.42227033155257,17 </coordinates>
+              </LinearRing>
+            </outerBoundaryIs>
+          </Polygon>
+        </Placemark>
+        <Placemark>
+          <name>Building 42</name>
+          <visibility>0</visibility>
+          <styleUrl>#transGreenPoly</styleUrl>
+          <Polygon>
+            <extrude>1</extrude>
+            <altitudeMode>relativeToGround</altitudeMode>
+            <outerBoundaryIs>
+              <LinearRing>
+                <coordinates> -122.0857862287242,37.42136208886969,25
+                  -122.0857312990603,37.42136935989481,25
+                  -122.0857312992918,37.42140934910903,25
+                  -122.0856077073679,37.42138390166565,25
+                  -122.0855802426516,37.42137299550869,25
+                  -122.0852186221971,37.42137299504316,25
+                  -122.0852277765639,37.42161656508265,25
+                  -122.0852598189347,37.42160565894403,25
+                  -122.0852598185499,37.42168200156,25
+                  -122.0852369311478,37.42170017860346,25
+                  -122.0852643957828,37.42176197982575,25
+                  -122.0853239032746,37.42176198013907,25
+                  -122.0853559454324,37.421852864452,25
+                  -122.0854108752463,37.42188921823734,25
+                  -122.0854795379357,37.42189285337048,25
+                  -122.0855436229819,37.42188921797546,25
+                  -122.0856260178042,37.42186013499926,25
+                  -122.085937287963,37.42186013453605,25
+                  -122.0859428718666,37.42160898590042,25
+                  -122.0859655469861,37.42157992759144,25
+                  -122.0858640462341,37.42147115002957,25
+                  -122.0858548911215,37.42140571326184,25
+                  -122.0858091162768,37.4214057134039,25
+                  -122.0857862287242,37.42136208886969,25 </coordinates>
+              </LinearRing>
+            </outerBoundaryIs>
+          </Polygon>
+        </Placemark>
+        <Placemark>
+          <name>Building 43</name>
+          <visibility>0</visibility>
+          <styleUrl>#transYellowPoly</styleUrl>
+          <Polygon>
+            <extrude>1</extrude>
+            <altitudeMode>relativeToGround</altitudeMode>
+            <outerBoundaryIs>
+              <LinearRing>
+                <coordinates> -122.0844371128284,37.42177253003091,19
+                  -122.0845118855746,37.42191111542896,19
+                  -122.0850470999805,37.42178755121535,19
+                  -122.0850719913391,37.42143663023161,19
+                  -122.084916406232,37.42137237822116,19
+                  -122.0842193868167,37.42137237801626,19
+                  -122.08421938659,37.42147617161496,19
+                  -122.0838086419991,37.4214613409357,19
+                  -122.0837899728564,37.42131306410796,19
+                  -122.0832796534698,37.42129328840593,19
+                  -122.0832609819207,37.42139213944298,19
+                  -122.0829373621737,37.42137236399876,19
+                  -122.0829062425667,37.42151569778871,19
+                  -122.0828502269665,37.42176282576465,19
+                  -122.0829435788635,37.42176776969635,19
+                  -122.083217411188,37.42179248552686,19
+                  -122.0835970430103,37.4217480074456,19
+                  -122.0839455556771,37.42169364237603,19
+                  -122.0840077894637,37.42176283815853,19
+                  -122.084113587521,37.42174801104392,19
+                  -122.0840762473784,37.42171341292375,19
+                  -122.0841447047739,37.42167881534569,19
+                  -122.084144704223,37.42181720660197,19
+                  -122.0842503333074,37.4218170700446,19
+                  -122.0844371128284,37.42177253003091,19 </coordinates>
+              </LinearRing>
+            </outerBoundaryIs>
+          </Polygon>
+        </Placemark>
+      </Folder>
+      <Folder>
+        <name>Extruded Polygon</name>
+        <description>A simple way to model a building</description>
+        <Placemark>
+          <name>The Pentagon</name>
+          <LookAt>
+            <longitude>-77.05580139178142</longitude>
+            <latitude>38.870832443487</latitude>
+            <heading>59.88865561738225</heading>
+            <tilt>48.09646074797388</tilt>
+            <range>742.0552506670548</range>
+          </LookAt>
+          <Polygon>
+            <extrude>1</extrude>
+            <altitudeMode>relativeToGround</altitudeMode>
+            <outerBoundaryIs>
+              <LinearRing>
+                <coordinates> -77.05788457660967,38.87253259892824,100
+                  -77.05465973756702,38.87291016281703,100
+                  -77.05315536854791,38.87053267794386,100
+                  -77.05552622493516,38.868757801256,100
+                  -77.05844056290393,38.86996206506943,100
+                  -77.05788457660967,38.87253259892824,100 </coordinates>
+              </LinearRing>
+            </outerBoundaryIs>
+            <innerBoundaryIs>
+              <LinearRing>
+                <coordinates> -77.05668055019126,38.87154239798456,100
+                  -77.05542625960818,38.87167890344077,100
+                  -77.05485125901024,38.87076535397792,100
+                  -77.05577677433152,38.87008686581446,100
+                  -77.05691162017543,38.87054446963351,100
+                  -77.05668055019126,38.87154239798456,100 </coordinates>
+              </LinearRing>
+            </innerBoundaryIs>
+          </Polygon>
+        </Placemark>
+      </Folder>
+      <Folder>
+        <name>Absolute and Relative</name>
+        <visibility>0</visibility>
+        <description>Four structures whose roofs meet exactly. Turn on/off
+          terrain to see the difference between relative and absolute
+          positioning.</description>
+        <LookAt>
+          <longitude>-112.3348969157552</longitude>
+          <latitude>36.14845533214919</latitude>
+          <altitude>0</altitude>
+          <heading>-86.91235037566909</heading>
+          <tilt>49.30695423894192</tilt>
+          <range>990.6761201087104</range>
+        </LookAt>
+        <Placemark>
+          <name>Absolute</name>
+          <visibility>0</visibility>
+          <styleUrl>#transBluePoly</styleUrl>
+          <Polygon>
+            <tessellate>1</tessellate>
+            <altitudeMode>absolute</altitudeMode>
+            <outerBoundaryIs>
+              <LinearRing>
+                <coordinates> -112.3372510731295,36.14888505105317,1784
+                  -112.3356128688403,36.14781540589019,1784
+                  -112.3368169371048,36.14658677734382,1784
+                  -112.3384408457543,36.14762778914076,1784
+                  -112.3372510731295,36.14888505105317,1784 </coordinates>
+              </LinearRing>
+            </outerBoundaryIs>
+          </Polygon>
+        </Placemark>
+        <Placemark>
+          <name>Absolute Extruded</name>
+          <visibility>0</visibility>
+          <styleUrl>#transRedPoly</styleUrl>
+          <Polygon>
+            <extrude>1</extrude>
+            <tessellate>1</tessellate>
+            <altitudeMode>absolute</altitudeMode>
+            <outerBoundaryIs>
+              <LinearRing>
+                <coordinates> -112.3396586818843,36.14637618647505,1784
+                  -112.3380597654315,36.14531751871353,1784
+                  -112.3368254237788,36.14659596244607,1784
+                  -112.3384555043203,36.14762621763982,1784
+                  -112.3396586818843,36.14637618647505,1784 </coordinates>
+              </LinearRing>
+            </outerBoundaryIs>
+          </Polygon>
+        </Placemark>
+        <Placemark>
+          <name>Relative</name>
+          <visibility>0</visibility>
+          <LookAt>
+            <longitude>-112.3350152490417</longitude>
+            <latitude>36.14943123077423</latitude>
+            <altitude>0</altitude>
+            <heading>-118.9214100848499</heading>
+            <tilt>37.92486261093203</tilt>
+            <range>345.5169113679813</range>
+          </LookAt>
+          <styleUrl>#transGreenPoly</styleUrl>
+          <Polygon>
+            <tessellate>1</tessellate>
+            <altitudeMode>relativeToGround</altitudeMode>
+            <outerBoundaryIs>
+              <LinearRing>
+                <coordinates> -112.3349463145932,36.14988705767721,100
+                  -112.3354019540677,36.14941108398372,100
+                  -112.3344428289146,36.14878490381308,100
+                  -112.3331289492913,36.14780840132443,100
+                  -112.3317019516947,36.14680755678357,100
+                  -112.331131440106,36.1474173426228,100
+                  -112.332616324338,36.14845453364654,100
+                  -112.3339876620524,36.14926570522069,100
+                  -112.3349463145932,36.14988705767721,100 </coordinates>
+              </LinearRing>
+            </outerBoundaryIs>
+          </Polygon>
+        </Placemark>
+        <Placemark>
+          <name>Relative Extruded</name>
+          <visibility>0</visibility>
+          <LookAt>
+            <longitude>-112.3351587892382</longitude>
+            <latitude>36.14979247129029</latitude>
+            <altitude>0</altitude>
+            <heading>-55.42811560891606</heading>
+            <tilt>56.10280503739589</tilt>
+            <range>401.0997279712519</range>
+          </LookAt>
+          <styleUrl>#transYellowPoly</styleUrl>
+          <Polygon>
+            <extrude>1</extrude>
+            <tessellate>1</tessellate>
+            <altitudeMode>relativeToGround</altitudeMode>
+            <outerBoundaryIs>
+              <LinearRing>
+                <coordinates> -112.3348783983763,36.1514008468736,100
+                  -112.3372535345629,36.14888517553886,100
+                  -112.3356068927954,36.14781612679284,100
+                  -112.3350034807972,36.14846469024177,100
+                  -112.3358353861232,36.1489624162954,100
+                  -112.3345888301373,36.15026229372507,100
+                  -112.3337937856278,36.14978096026463,100
+                  -112.3331798208424,36.1504472788618,100
+                  -112.3348783983763,36.1514008468736,100 </coordinates>
+              </LinearRing>
+            </outerBoundaryIs>
+          </Polygon>
+        </Placemark>
+      </Folder>
+    </Folder>
+  </Document>
+</kml>
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testKeynote.key b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testKeynote.key
new file mode 100644
index 0000000..6e0e032
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testKeynote.key differ
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testMP3i18n.mp3 b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testMP3i18n.mp3
new file mode 100644
index 0000000..0f25370
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testMP3i18n.mp3 differ
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testMP4.m4a b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testMP4.m4a
new file mode 100644
index 0000000..a9bc731
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testMP4.m4a differ
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testNumbers.numbers b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testNumbers.numbers
new file mode 100644
index 0000000..51360e0
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testNumbers.numbers differ
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testPDF.pdf b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testPDF.pdf
new file mode 100644
index 0000000..1f1bcff
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testPDF.pdf differ
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testPNG.png b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testPNG.png
new file mode 100644
index 0000000..afbcb5f
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testPNG.png differ
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testPPM.ppm b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testPPM.ppm
new file mode 100644
index 0000000..63ad5b1
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testPPM.ppm
@@ -0,0 +1,4 @@
+P3
+1 1
+255
+0 0 0
\ No newline at end of file
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testPPT_various.ppt b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testPPT_various.ppt
new file mode 100644
index 0000000..75829de
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testPPT_various.ppt differ
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testPPT_various.pptx b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testPPT_various.pptx
new file mode 100644
index 0000000..92c2744
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testPPT_various.pptx differ
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testPSD.psd b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testPSD.psd
new file mode 100644
index 0000000..7cedbc2
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testPSD.psd differ
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testPages.pages b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testPages.pages
new file mode 100644
index 0000000..9fe1e40
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testPages.pages differ
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testRDF.rdf b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testRDF.rdf
new file mode 100644
index 0000000..04b3da7
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testRDF.rdf
@@ -0,0 +1,23 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements.  See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License.  You may obtain a copy of the License at
+
+     http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
+         xmlns:dc="http://purl.org/dc/elements/1.1/">
+  <rdf:Description
+      rdf:about="http://lucene.apache.org/tika/"
+      dc:title="Apache Tika"/>
+</rdf:RDF>
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testRFC822 b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testRFC822
new file mode 100644
index 0000000..9ce423a
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testRFC822
@@ -0,0 +1,41 @@
+From: "Julien Nioche (JIRA)" <jira@apache.org>
+To: dev@tika.apache.org
+Subject: [jira] Commented: (TIKA-461) RFC822 messages not parsed
+Reply-To: dev@tika.apache.org
+Delivered-To: mailing list dev@tika.apache.org
+Date: Mon, 6 Sep 2010 05:25:34 -0400 (EDT)
+In-Reply-To: <6089099.260231278600349994.JavaMail.jira@thor>
+MIME-Version: 1.0
+Content-Type: text/plain; charset=utf-8
+Content-Transfer-Encoding: 7bit
+X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394
+X-Virus-Checked: Checked by ClamAV on apache.org
+
+
+    [ https://issues.apache.org/jira/browse/TIKA-461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12906468#action_12906468 ] 
+
+Julien Nioche commented on TIKA-461:
+------------------------------------
+
+I'll have a look at mime4j and try to use it in Tika
+
+> RFC822 messages not parsed
+> --------------------------
+>
+>                 Key: TIKA-461
+>                 URL: https://issues.apache.org/jira/browse/TIKA-461
+>             Project: Tika
+>          Issue Type: Bug
+>          Components: parser
+>    Affects Versions: 0.7
+>            Reporter: Joshua Turner
+>            Assignee: Julien Nioche
+>
+> Presented with an RFC822 message exported from Thunderbird, AutodetectParser produces an empty body, and a Metadata containing only one key-value pair: "Content-Type=message/rfc822". Directly calling MboxParser likewise gives an empty body, but with two metadata pairs: "Content-Encoding=us-ascii Content-Type=application/mbox".
+> A quick peek at the source of MboxParser shows that the implementation is pretty naive. If the wiring can be sorted out, something like Apache James' mime4j might be a better bet.
+
+-- 
+This message is automatically generated by JIRA.
+-
+You can reply to this email to add a comment to the issue online.
+
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testRTFVarious.rtf b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testRTFVarious.rtf
new file mode 100644
index 0000000..57fadb9
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testRTFVarious.rtf
@@ -0,0 +1,329 @@
+{\rtf1\adeflang1025\ansi\ansicpg1252\uc1\adeff31507\deff0\stshfdbch31506\stshfloch31506\stshfhich31506\stshfbi31507\deflang1033\deflangfe1033\themelang1033\themelangfe0\themelangcs0{\fonttbl{\f0\fbidi \froman\fcharset0\fprq2{\*\panose 02020603050405020304}Times New Roman;}{\f1\fbidi \fswiss\fcharset0\fprq2{\*\panose 020b0604020202020204}Arial;}
+{\f2\fbidi \fmodern\fcharset0\fprq1{\*\panose 02070309020205020404}Courier New;}{\f3\fbidi \froman\fcharset2\fprq2{\*\panose 05050102010706020507}Symbol;}{\f10\fbidi \fnil\fcharset2\fprq2{\*\panose 05000000000000000000}Wingdings;}
+{\f11\fbidi \fmodern\fcharset128\fprq1{\*\panose 02020609040205080304}MS Mincho{\*\falt \'82\'6c\'82\'72 \'96\'be\'92\'a9};}{\f15\fbidi \fmodern\fcharset128\fprq1{\*\panose 020b0609070205080204}MS Gothic{\*\falt MS Mincho};}
+{\f34\fbidi \froman\fcharset0\fprq2{\*\panose 02040503050406030204}Cambria Math;}{\f37\fbidi \fswiss\fcharset0\fprq2{\*\panose 020f0502020204030204}Calibri;}{\f38\fbidi \fswiss\fcharset0\fprq2{\*\panose 020b0604030504040204}Tahoma;}
+{\f175\fbidi \fmodern\fcharset128\fprq1{\*\panose 02020609040205080304}@MS Mincho;}{\f209\fbidi \fmodern\fcharset128\fprq1{\*\panose 00000000000000000000}@MS Gothic;}
+{\flomajor\f31500\fbidi \froman\fcharset0\fprq2{\*\panose 02020603050405020304}Times New Roman;}{\fdbmajor\f31501\fbidi \froman\fcharset0\fprq2{\*\panose 02020603050405020304}Times New Roman;}
+{\fhimajor\f31502\fbidi \froman\fcharset0\fprq2{\*\panose 02040503050406030204}Cambria;}{\fbimajor\f31503\fbidi \froman\fcharset0\fprq2{\*\panose 02020603050405020304}Times New Roman;}
+{\flominor\f31504\fbidi \froman\fcharset0\fprq2{\*\panose 02020603050405020304}Times New Roman;}{\fdbminor\f31505\fbidi \froman\fcharset0\fprq2{\*\panose 02020603050405020304}Times New Roman;}
+{\fhiminor\f31506\fbidi \fswiss\fcharset0\fprq2{\*\panose 020f0502020204030204}Calibri;}{\fbiminor\f31507\fbidi \froman\fcharset0\fprq2{\*\panose 02020603050405020304}Times New Roman;}{\f210\fbidi \froman\fcharset238\fprq2 Times New Roman CE;}
+{\f211\fbidi \froman\fcharset204\fprq2 Times New Roman Cyr;}{\f213\fbidi \froman\fcharset161\fprq2 Times New Roman Greek;}{\f214\fbidi \froman\fcharset162\fprq2 Times New Roman Tur;}{\f215\fbidi \froman\fcharset177\fprq2 Times New Roman (Hebrew);}
+{\f216\fbidi \froman\fcharset178\fprq2 Times New Roman (Arabic);}{\f217\fbidi \froman\fcharset186\fprq2 Times New Roman Baltic;}{\f218\fbidi \froman\fcharset163\fprq2 Times New Roman (Vietnamese);}{\f220\fbidi \fswiss\fcharset238\fprq2 Arial CE;}
+{\f221\fbidi \fswiss\fcharset204\fprq2 Arial Cyr;}{\f223\fbidi \fswiss\fcharset161\fprq2 Arial Greek;}{\f224\fbidi \fswiss\fcharset162\fprq2 Arial Tur;}{\f225\fbidi \fswiss\fcharset177\fprq2 Arial (Hebrew);}
+{\f226\fbidi \fswiss\fcharset178\fprq2 Arial (Arabic);}{\f227\fbidi \fswiss\fcharset186\fprq2 Arial Baltic;}{\f228\fbidi \fswiss\fcharset163\fprq2 Arial (Vietnamese);}{\f230\fbidi \fmodern\fcharset238\fprq1 Courier New CE;}
+{\f231\fbidi \fmodern\fcharset204\fprq1 Courier New Cyr;}{\f233\fbidi \fmodern\fcharset161\fprq1 Courier New Greek;}{\f234\fbidi \fmodern\fcharset162\fprq1 Courier New Tur;}{\f235\fbidi \fmodern\fcharset177\fprq1 Courier New (Hebrew);}
+{\f236\fbidi \fmodern\fcharset178\fprq1 Courier New (Arabic);}{\f237\fbidi \fmodern\fcharset186\fprq1 Courier New Baltic;}{\f238\fbidi \fmodern\fcharset163\fprq1 Courier New (Vietnamese);}
+{\f322\fbidi \fmodern\fcharset0\fprq1 MS Mincho Western{\*\falt \'82\'6c\'82\'72 \'96\'be\'92\'a9};}{\f320\fbidi \fmodern\fcharset238\fprq1 MS Mincho CE{\*\falt \'82\'6c\'82\'72 \'96\'be\'92\'a9};}
+{\f321\fbidi \fmodern\fcharset204\fprq1 MS Mincho Cyr{\*\falt \'82\'6c\'82\'72 \'96\'be\'92\'a9};}{\f323\fbidi \fmodern\fcharset161\fprq1 MS Mincho Greek{\*\falt \'82\'6c\'82\'72 \'96\'be\'92\'a9};}
+{\f324\fbidi \fmodern\fcharset162\fprq1 MS Mincho Tur{\*\falt \'82\'6c\'82\'72 \'96\'be\'92\'a9};}{\f327\fbidi \fmodern\fcharset186\fprq1 MS Mincho Baltic{\*\falt \'82\'6c\'82\'72 \'96\'be\'92\'a9};}{\f550\fbidi \froman\fcharset238\fprq2 Cambria Math CE;}
+{\f551\fbidi \froman\fcharset204\fprq2 Cambria Math Cyr;}{\f553\fbidi \froman\fcharset161\fprq2 Cambria Math Greek;}{\f554\fbidi \froman\fcharset162\fprq2 Cambria Math Tur;}{\f557\fbidi \froman\fcharset186\fprq2 Cambria Math Baltic;}
+{\f580\fbidi \fswiss\fcharset238\fprq2 Calibri CE;}{\f581\fbidi \fswiss\fcharset204\fprq2 Calibri Cyr;}{\f583\fbidi \fswiss\fcharset161\fprq2 Calibri Greek;}{\f584\fbidi \fswiss\fcharset162\fprq2 Calibri Tur;}
+{\f587\fbidi \fswiss\fcharset186\fprq2 Calibri Baltic;}{\f590\fbidi \fswiss\fcharset238\fprq2 Tahoma CE;}{\f591\fbidi \fswiss\fcharset204\fprq2 Tahoma Cyr;}{\f593\fbidi \fswiss\fcharset161\fprq2 Tahoma Greek;}
+{\f594\fbidi \fswiss\fcharset162\fprq2 Tahoma Tur;}{\f595\fbidi \fswiss\fcharset177\fprq2 Tahoma (Hebrew);}{\f596\fbidi \fswiss\fcharset178\fprq2 Tahoma (Arabic);}{\f597\fbidi \fswiss\fcharset186\fprq2 Tahoma Baltic;}
+{\f598\fbidi \fswiss\fcharset163\fprq2 Tahoma (Vietnamese);}{\f599\fbidi \fswiss\fcharset222\fprq2 Tahoma (Thai);}{\f1962\fbidi \fmodern\fcharset0\fprq1 @MS Mincho Western;}{\f1960\fbidi \fmodern\fcharset238\fprq1 @MS Mincho CE;}
+{\f1961\fbidi \fmodern\fcharset204\fprq1 @MS Mincho Cyr;}{\f1963\fbidi \fmodern\fcharset161\fprq1 @MS Mincho Greek;}{\f1964\fbidi \fmodern\fcharset162\fprq1 @MS Mincho Tur;}{\f1967\fbidi \fmodern\fcharset186\fprq1 @MS Mincho Baltic;}
+{\flomajor\f31508\fbidi \froman\fcharset238\fprq2 Times New Roman CE;}{\flomajor\f31509\fbidi \froman\fcharset204\fprq2 Times New Roman Cyr;}{\flomajor\f31511\fbidi \froman\fcharset161\fprq2 Times New Roman Greek;}
+{\flomajor\f31512\fbidi \froman\fcharset162\fprq2 Times New Roman Tur;}{\flomajor\f31513\fbidi \froman\fcharset177\fprq2 Times New Roman (Hebrew);}{\flomajor\f31514\fbidi \froman\fcharset178\fprq2 Times New Roman (Arabic);}
+{\flomajor\f31515\fbidi \froman\fcharset186\fprq2 Times New Roman Baltic;}{\flomajor\f31516\fbidi \froman\fcharset163\fprq2 Times New Roman (Vietnamese);}{\fdbmajor\f31518\fbidi \froman\fcharset238\fprq2 Times New Roman CE;}
+{\fdbmajor\f31519\fbidi \froman\fcharset204\fprq2 Times New Roman Cyr;}{\fdbmajor\f31521\fbidi \froman\fcharset161\fprq2 Times New Roman Greek;}{\fdbmajor\f31522\fbidi \froman\fcharset162\fprq2 Times New Roman Tur;}
+{\fdbmajor\f31523\fbidi \froman\fcharset177\fprq2 Times New Roman (Hebrew);}{\fdbmajor\f31524\fbidi \froman\fcharset178\fprq2 Times New Roman (Arabic);}{\fdbmajor\f31525\fbidi \froman\fcharset186\fprq2 Times New Roman Baltic;}
+{\fdbmajor\f31526\fbidi \froman\fcharset163\fprq2 Times New Roman (Vietnamese);}{\fhimajor\f31528\fbidi \froman\fcharset238\fprq2 Cambria CE;}{\fhimajor\f31529\fbidi \froman\fcharset204\fprq2 Cambria Cyr;}
+{\fhimajor\f31531\fbidi \froman\fcharset161\fprq2 Cambria Greek;}{\fhimajor\f31532\fbidi \froman\fcharset162\fprq2 Cambria Tur;}{\fhimajor\f31535\fbidi \froman\fcharset186\fprq2 Cambria Baltic;}
+{\fbimajor\f31538\fbidi \froman\fcharset238\fprq2 Times New Roman CE;}{\fbimajor\f31539\fbidi \froman\fcharset204\fprq2 Times New Roman Cyr;}{\fbimajor\f31541\fbidi \froman\fcharset161\fprq2 Times New Roman Greek;}
+{\fbimajor\f31542\fbidi \froman\fcharset162\fprq2 Times New Roman Tur;}{\fbimajor\f31543\fbidi \froman\fcharset177\fprq2 Times New Roman (Hebrew);}{\fbimajor\f31544\fbidi \froman\fcharset178\fprq2 Times New Roman (Arabic);}
+{\fbimajor\f31545\fbidi \froman\fcharset186\fprq2 Times New Roman Baltic;}{\fbimajor\f31546\fbidi \froman\fcharset163\fprq2 Times New Roman (Vietnamese);}{\flominor\f31548\fbidi \froman\fcharset238\fprq2 Times New Roman CE;}
+{\flominor\f31549\fbidi \froman\fcharset204\fprq2 Times New Roman Cyr;}{\flominor\f31551\fbidi \froman\fcharset161\fprq2 Times New Roman Greek;}{\flominor\f31552\fbidi \froman\fcharset162\fprq2 Times New Roman Tur;}
+{\flominor\f31553\fbidi \froman\fcharset177\fprq2 Times New Roman (Hebrew);}{\flominor\f31554\fbidi \froman\fcharset178\fprq2 Times New Roman (Arabic);}{\flominor\f31555\fbidi \froman\fcharset186\fprq2 Times New Roman Baltic;}
+{\flominor\f31556\fbidi \froman\fcharset163\fprq2 Times New Roman (Vietnamese);}{\fdbminor\f31558\fbidi \froman\fcharset238\fprq2 Times New Roman CE;}{\fdbminor\f31559\fbidi \froman\fcharset204\fprq2 Times New Roman Cyr;}
+{\fdbminor\f31561\fbidi \froman\fcharset161\fprq2 Times New Roman Greek;}{\fdbminor\f31562\fbidi \froman\fcharset162\fprq2 Times New Roman Tur;}{\fdbminor\f31563\fbidi \froman\fcharset177\fprq2 Times New Roman (Hebrew);}
+{\fdbminor\f31564\fbidi \froman\fcharset178\fprq2 Times New Roman (Arabic);}{\fdbminor\f31565\fbidi \froman\fcharset186\fprq2 Times New Roman Baltic;}{\fdbminor\f31566\fbidi \froman\fcharset163\fprq2 Times New Roman (Vietnamese);}
+{\fhiminor\f31568\fbidi \fswiss\fcharset238\fprq2 Calibri CE;}{\fhiminor\f31569\fbidi \fswiss\fcharset204\fprq2 Calibri Cyr;}{\fhiminor\f31571\fbidi \fswiss\fcharset161\fprq2 Calibri Greek;}{\fhiminor\f31572\fbidi \fswiss\fcharset162\fprq2 Calibri Tur;}
+{\fhiminor\f31575\fbidi \fswiss\fcharset186\fprq2 Calibri Baltic;}{\fbiminor\f31578\fbidi \froman\fcharset238\fprq2 Times New Roman CE;}{\fbiminor\f31579\fbidi \froman\fcharset204\fprq2 Times New Roman Cyr;}
+{\fbiminor\f31581\fbidi \froman\fcharset161\fprq2 Times New Roman Greek;}{\fbiminor\f31582\fbidi \froman\fcharset162\fprq2 Times New Roman Tur;}{\fbiminor\f31583\fbidi \froman\fcharset177\fprq2 Times New Roman (Hebrew);}
+{\fbiminor\f31584\fbidi \froman\fcharset178\fprq2 Times New Roman (Arabic);}{\fbiminor\f31585\fbidi \froman\fcharset186\fprq2 Times New Roman Baltic;}{\fbiminor\f31586\fbidi \froman\fcharset163\fprq2 Times New Roman (Vietnamese);}}
+{\colortbl;\red0\green0\blue0;\red0\green0\blue255;\red0\green255\blue255;\red0\green255\blue0;\red255\green0\blue255;\red255\green0\blue0;\red255\green255\blue0;\red255\green255\blue255;\red0\green0\blue128;\red0\green128\blue128;\red0\green128\blue0;
+\red128\green0\blue128;\red128\green0\blue0;\red128\green128\blue0;\red128\green128\blue128;\red192\green192\blue192;\chyperlink\ctint255\cshade255\red0\green0\blue255;\caccentone\ctint255\cshade255\red79\green129\blue189;}{\*\defchp \f31506\fs22 }
+{\*\defpap \ql \li0\ri0\sa200\sl276\slmult1\widctlpar\wrapdefault\aspalpha\aspnum\faauto\adjustright\rin0\lin0\itap0 }\noqfpromote {\stylesheet{\ql \li0\ri0\sa200\sl276\slmult1\widctlpar\wrapdefault\aspalpha\aspnum\faauto\adjustright\rin0\lin0\itap0 
+\rtlch\fcs1 \af31507\afs22\alang1025 \ltrch\fcs0 \f31506\fs22\lang1033\langfe1033\cgrid\langnp1033\langfenp1033 \snext0 \sqformat \spriority0 \styrsid16456967 Normal;}{\*\cs10 \additive \ssemihidden \sunhideused \spriority1 Default Paragraph Font;}{\*
+\ts11\tsrowd\trftsWidthB3\trpaddl108\trpaddr108\trpaddfl3\trpaddft3\trpaddfb3\trpaddfr3\tblind0\tblindtype3\tscellwidthfts0\tsvertalt\tsbrdrt\tsbrdrl\tsbrdrb\tsbrdrr\tsbrdrdgl\tsbrdrdgr\tsbrdrh\tsbrdrv \ql \li0\ri0\sa200\sl276\slmult1
+\widctlpar\wrapdefault\aspalpha\aspnum\faauto\adjustright\rin0\lin0\itap0 \rtlch\fcs1 \af31507\afs22\alang1025 \ltrch\fcs0 \f31506\fs22\lang1033\langfe1033\cgrid\langnp1033\langfenp1033 \snext11 \ssemihidden \sunhideused \sqformat Normal Table;}{
+\s15\ql \li0\ri0\widctlpar\tqc\tx4680\tqr\tx9360\wrapdefault\aspalpha\aspnum\faauto\adjustright\rin0\lin0\itap0 \rtlch\fcs1 \af31507\afs22\alang1025 \ltrch\fcs0 \f31506\fs22\lang1033\langfe1033\cgrid\langnp1033\langfenp1033 
+\sbasedon0 \snext15 \slink16 \sunhideused \styrsid4535536 header;}{\*\cs16 \additive \rtlch\fcs1 \af0 \ltrch\fcs0 \sbasedon10 \slink15 \slocked \styrsid4535536 Header Char;}{\s17\ql \li0\ri0\widctlpar
+\tqc\tx4680\tqr\tx9360\wrapdefault\aspalpha\aspnum\faauto\adjustright\rin0\lin0\itap0 \rtlch\fcs1 \af31507\afs22\alang1025 \ltrch\fcs0 \f31506\fs22\lang1033\langfe1033\cgrid\langnp1033\langfenp1033 
+\sbasedon0 \snext17 \slink18 \sunhideused \styrsid4535536 footer;}{\*\cs18 \additive \rtlch\fcs1 \af0 \ltrch\fcs0 \sbasedon10 \slink17 \slocked \styrsid4535536 Footer Char;}{
+\s19\ql \li0\ri0\widctlpar\wrapdefault\aspalpha\aspnum\faauto\adjustright\rin0\lin0\itap0 \rtlch\fcs1 \af38\afs16\alang1025 \ltrch\fcs0 \f38\fs16\lang1033\langfe1033\cgrid\langnp1033\langfenp1033 
+\sbasedon0 \snext19 \slink20 \ssemihidden \sunhideused \styrsid4535536 Balloon Text;}{\*\cs20 \additive \rtlch\fcs1 \af38\afs16 \ltrch\fcs0 \f38\fs16 \sbasedon10 \slink19 \slocked \ssemihidden \styrsid4535536 Balloon Text Char;}{\*\cs21 \additive 
+\rtlch\fcs1 \af0 \ltrch\fcs0 \ul\cf17 \sbasedon10 \sunhideused \styrsid4535536 Hyperlink;}{\*\cs22 \additive \rtlch\fcs1 \af0 \ltrch\fcs0 \cf15 \sbasedon10 \ssemihidden \styrsid4535536 Placeholder Text;}{
+\s23\ql \li0\ri0\widctlpar\wrapdefault\aspalpha\aspnum\faauto\adjustright\rin0\lin0\itap0 \rtlch\fcs1 \af31507\afs20\alang1025 \ltrch\fcs0 \f31506\fs20\lang1033\langfe1033\cgrid\langnp1033\langfenp1033 
+\sbasedon0 \snext23 \slink24 \ssemihidden \sunhideused \styrsid10829135 footnote text;}{\*\cs24 \additive \rtlch\fcs1 \af0\afs20 \ltrch\fcs0 \fs20 \sbasedon10 \slink23 \slocked \ssemihidden \styrsid10829135 Footnote Text Char;}{\*\cs25 \additive 
+\rtlch\fcs1 \af0 \ltrch\fcs0 \super \sbasedon10 \ssemihidden \sunhideused \styrsid10829135 footnote reference;}{\*\ts26\tsrowd\trbrdrt\brdrs\brdrw10 \trbrdrl\brdrs\brdrw10 \trbrdrb\brdrs\brdrw10 \trbrdrr\brdrs\brdrw10 \trbrdrh\brdrs\brdrw10 \trbrdrv
+\brdrs\brdrw10 \trftsWidthB3\trpaddl108\trpaddr108\trpaddfl3\trpaddft3\trpaddfb3\trpaddfr3\tblind0\tblindtype3\tscellwidthfts0\tsvertalt\tsbrdrt\tsbrdrl\tsbrdrb\tsbrdrr\tsbrdrdgl\tsbrdrdgr\tsbrdrh\tsbrdrv 
+\ql \li0\ri0\widctlpar\wrapdefault\aspalpha\aspnum\faauto\adjustright\rin0\lin0\itap0 \rtlch\fcs1 \af31507\afs22\alang1025 \ltrch\fcs0 \f31506\fs22\lang1033\langfe1033\cgrid\langnp1033\langfenp1033 \sbasedon11 \snext26 \spriority59 \styrsid8288896 
+Table Grid;}{\s27\ql \li720\ri0\sa200\sl276\slmult1\widctlpar\wrapdefault\aspalpha\aspnum\faauto\adjustright\rin0\lin720\itap0\contextualspace \rtlch\fcs1 \af31507\afs22\alang1025 \ltrch\fcs0 \f31506\fs22\lang1033\langfe1033\cgrid\langnp1033\langfenp1033 
+\sbasedon0 \snext27 \sqformat \spriority34 \styrsid10055055 List Paragraph;}{\s28\ql \li0\ri0\sa200\widctlpar\wrapdefault\aspalpha\aspnum\faauto\adjustright\rin0\lin0\itap0 \rtlch\fcs1 \ab\af31507\afs18\alang1025 \ltrch\fcs0 
+\b\f31506\fs18\cf18\lang1033\langfe1033\cgrid\langnp1033\langfenp1033 \sbasedon0 \snext0 \sunhideused \sqformat \spriority35 \styrsid11105546 caption;}}{\*\listtable{\list\listtemplateid1249008552\listhybrid{\listlevel\levelnfc23\levelnfcn23\leveljc0
+\leveljcn0\levelfollow0\levelstartat1\levelspace360\levelindent0{\leveltext\leveltemplateid67698689\'01\u-3913 ?;}{\levelnumbers;}\f3\fbias0\hres0\chhres0 \fi-360\li720\lin720 }{\listlevel\levelnfc23\levelnfcn23\leveljc0\leveljcn0\levelfollow0
+\levelstartat1\lvltentative\levelspace360\levelindent0{\leveltext\leveltemplateid67698691\'01o;}{\levelnumbers;}\f2\fbias0\hres0\chhres0 \fi-360\li1440\lin1440 }{\listlevel\levelnfc23\levelnfcn23\leveljc0\leveljcn0\levelfollow0\levelstartat1\lvltentative
+\levelspace360\levelindent0{\leveltext\leveltemplateid67698693\'01\u-3929 ?;}{\levelnumbers;}\f10\fbias0\hres0\chhres0 \fi-360\li2160\lin2160 }{\listlevel\levelnfc23\levelnfcn23\leveljc0\leveljcn0\levelfollow0\levelstartat1\lvltentative\levelspace360
+\levelindent0{\leveltext\leveltemplateid67698689\'01\u-3913 ?;}{\levelnumbers;}\f3\fbias0\hres0\chhres0 \fi-360\li2880\lin2880 }{\listlevel\levelnfc23\levelnfcn23\leveljc0\leveljcn0\levelfollow0\levelstartat1\lvltentative\levelspace360\levelindent0
+{\leveltext\leveltemplateid67698691\'01o;}{\levelnumbers;}\f2\fbias0\hres0\chhres0 \fi-360\li3600\lin3600 }{\listlevel\levelnfc23\levelnfcn23\leveljc0\leveljcn0\levelfollow0\levelstartat1\lvltentative\levelspace360\levelindent0{\leveltext
+\leveltemplateid67698693\'01\u-3929 ?;}{\levelnumbers;}\f10\fbias0\hres0\chhres0 \fi-360\li4320\lin4320 }{\listlevel\levelnfc23\levelnfcn23\leveljc0\leveljcn0\levelfollow0\levelstartat1\lvltentative\levelspace360\levelindent0{\leveltext
+\leveltemplateid67698689\'01\u-3913 ?;}{\levelnumbers;}\f3\fbias0\hres0\chhres0 \fi-360\li5040\lin5040 }{\listlevel\levelnfc23\levelnfcn23\leveljc0\leveljcn0\levelfollow0\levelstartat1\lvltentative\levelspace360\levelindent0{\leveltext
+\leveltemplateid67698691\'01o;}{\levelnumbers;}\f2\fbias0\hres0\chhres0 \fi-360\li5760\lin5760 }{\listlevel\levelnfc23\levelnfcn23\leveljc0\leveljcn0\levelfollow0\levelstartat1\lvltentative\levelspace360\levelindent0{\leveltext\leveltemplateid67698693
+\'01\u-3929 ?;}{\levelnumbers;}\f10\fbias0\hres0\chhres0 \fi-360\li6480\lin6480 }{\listname ;}\listid73432867}{\list\listtemplateid1071396652\listhybrid{\listlevel\levelnfc23\levelnfcn23\leveljc0\leveljcn0\levelfollow0\levelstartat1\levelspace360
+\levelindent0{\leveltext\leveltemplateid67698689\'01\u-3913 ?;}{\levelnumbers;}\f3\fbias0\hres0\chhres0 \fi-360\li720\lin720 }{\listlevel\levelnfc23\levelnfcn23\leveljc0\leveljcn0\levelfollow0\levelstartat1\lvltentative\levelspace360\levelindent0
+{\leveltext\leveltemplateid67698691\'01o;}{\levelnumbers;}\f2\fbias0\hres0\chhres0 \fi-360\li1440\lin1440 }{\listlevel\levelnfc23\levelnfcn23\leveljc0\leveljcn0\levelfollow0\levelstartat1\lvltentative\levelspace360\levelindent0{\leveltext
+\leveltemplateid67698693\'01\u-3929 ?;}{\levelnumbers;}\f10\fbias0\hres0\chhres0 \fi-360\li2160\lin2160 }{\listlevel\levelnfc23\levelnfcn23\leveljc0\leveljcn0\levelfollow0\levelstartat1\lvltentative\levelspace360\levelindent0{\leveltext
+\leveltemplateid67698689\'01\u-3913 ?;}{\levelnumbers;}\f3\fbias0\hres0\chhres0 \fi-360\li2880\lin2880 }{\listlevel\levelnfc23\levelnfcn23\leveljc0\leveljcn0\levelfollow0\levelstartat1\lvltentative\levelspace360\levelindent0{\leveltext
+\leveltemplateid67698691\'01o;}{\levelnumbers;}\f2\fbias0\hres0\chhres0 \fi-360\li3600\lin3600 }{\listlevel\levelnfc23\levelnfcn23\leveljc0\leveljcn0\levelfollow0\levelstartat1\lvltentative\levelspace360\levelindent0{\leveltext\leveltemplateid67698693
+\'01\u-3929 ?;}{\levelnumbers;}\f10\fbias0\hres0\chhres0 \fi-360\li4320\lin4320 }{\listlevel\levelnfc23\levelnfcn23\leveljc0\leveljcn0\levelfollow0\levelstartat1\lvltentative\levelspace360\levelindent0{\leveltext\leveltemplateid67698689
+\'01\u-3913 ?;}{\levelnumbers;}\f3\fbias0\hres0\chhres0 \fi-360\li5040\lin5040 }{\listlevel\levelnfc23\levelnfcn23\leveljc0\leveljcn0\levelfollow0\levelstartat1\lvltentative\levelspace360\levelindent0{\leveltext\leveltemplateid67698691
+\'01o;}{\levelnumbers;}\f2\fbias0\hres0\chhres0 \fi-360\li5760\lin5760 }{\listlevel\levelnfc23\levelnfcn23\leveljc0\leveljcn0\levelfollow0\levelstartat1\lvltentative\levelspace360\levelindent0{\leveltext\leveltemplateid67698693
+\'01\u-3929 ?;}{\levelnumbers;}\f10\fbias0\hres0\chhres0 \fi-360\li6480\lin6480 }{\listname ;}\listid169494399}{\list\listtemplateid-487930464\listhybrid{\listlevel\levelnfc0\levelnfcn0\leveljc0\leveljcn0\levelfollow0\levelstartat1\levelspace360
+\levelindent0{\leveltext\leveltemplateid67698705\'02\'00);}{\levelnumbers\'01;}\rtlch\fcs1 \af0 \ltrch\fcs0 \hres0\chhres0 \fi-360\li720\lin720 }{\listlevel\levelnfc4\levelnfcn4\leveljc0\leveljcn0\levelfollow0\levelstartat1\lvltentative\levelspace360
+\levelindent0{\leveltext\leveltemplateid67698713\'02\'01.;}{\levelnumbers\'01;}\rtlch\fcs1 \af0 \ltrch\fcs0 \hres0\chhres0 \fi-360\li1440\lin1440 }{\listlevel\levelnfc2\levelnfcn2\leveljc2\leveljcn2\levelfollow0\levelstartat1\lvltentative\levelspace360
+\levelindent0{\leveltext\leveltemplateid67698715\'02\'02.;}{\levelnumbers\'01;}\rtlch\fcs1 \af0 \ltrch\fcs0 \hres0\chhres0 \fi-180\li2160\lin2160 }{\listlevel\levelnfc0\levelnfcn0\leveljc0\leveljcn0\levelfollow0\levelstartat1\lvltentative\levelspace360
+\levelindent0{\leveltext\leveltemplateid67698703\'02\'03.;}{\levelnumbers\'01;}\rtlch\fcs1 \af0 \ltrch\fcs0 \hres0\chhres0 \fi-360\li2880\lin2880 }{\listlevel\levelnfc4\levelnfcn4\leveljc0\leveljcn0\levelfollow0\levelstartat1\lvltentative\levelspace360
+\levelindent0{\leveltext\leveltemplateid67698713\'02\'04.;}{\levelnumbers\'01;}\rtlch\fcs1 \af0 \ltrch\fcs0 \hres0\chhres0 \fi-360\li3600\lin3600 }{\listlevel\levelnfc2\levelnfcn2\leveljc2\leveljcn2\levelfollow0\levelstartat1\lvltentative\levelspace360
+\levelindent0{\leveltext\leveltemplateid67698715\'02\'05.;}{\levelnumbers\'01;}\rtlch\fcs1 \af0 \ltrch\fcs0 \hres0\chhres0 \fi-180\li4320\lin4320 }{\listlevel\levelnfc0\levelnfcn0\leveljc0\leveljcn0\levelfollow0\levelstartat1\lvltentative\levelspace360
+\levelindent0{\leveltext\leveltemplateid67698703\'02\'06.;}{\levelnumbers\'01;}\rtlch\fcs1 \af0 \ltrch\fcs0 \hres0\chhres0 \fi-360\li5040\lin5040 }{\listlevel\levelnfc4\levelnfcn4\leveljc0\leveljcn0\levelfollow0\levelstartat1\lvltentative\levelspace360
+\levelindent0{\leveltext\leveltemplateid67698713\'02\'07.;}{\levelnumbers\'01;}\rtlch\fcs1 \af0 \ltrch\fcs0 \hres0\chhres0 \fi-360\li5760\lin5760 }{\listlevel\levelnfc2\levelnfcn2\leveljc2\leveljcn2\levelfollow0\levelstartat1\lvltentative\levelspace360
+\levelindent0{\leveltext\leveltemplateid67698715\'02\'08.;}{\levelnumbers\'01;}\rtlch\fcs1 \af0 \ltrch\fcs0 \hres0\chhres0 \fi-180\li6480\lin6480 }{\listname ;}\listid1132862691}}{\*\listoverridetable{\listoverride\listid169494399\listoverridecount0\ls1}
+{\listoverride\listid73432867\listoverridecount0\ls2}{\listoverride\listid1132862691\listoverridecount0\ls3}}{\*\rsidtbl \rsid724479\rsid2255182\rsid2767955\rsid4260063\rsid4535536\rsid5051464\rsid5706211\rsid5843828\rsid7218132\rsid8152053\rsid8288896
+\rsid9897893\rsid9969477\rsid10055055\rsid10249050\rsid10829135\rsid11105546\rsid12662658\rsid12941695\rsid13331334\rsid14163426\rsid14225018\rsid14292078\rsid14556934\rsid16456967\rsid16539678}{\mmathPr\mmathFont34\mbrkBin0\mbrkBinSub0\msmallFrac0
+\mdispDef1\mlMargin0\mrMargin0\mdefJc1\mwrapIndent1440\mintLim0\mnaryLim1}{\info{\subject Subject is here}{\author Michael McCandless}{\keywords Keyword1 Keyword2}{\operator Michael McCandless}{\creatim\yr2011\mo8\dy29\hr5\min20}
+{\revtim\yr2011\mo8\dy30\hr6\min13}{\version30}{\edmins445}{\nofpages2}{\nofwords95}{\nofchars546}{\nofcharsws640}{\vern32771}}{\*\xmlnstbl {\xmlns1 http://schemas.microsoft.com/office/word/2003/wordml}}
+\paperw12240\paperh15840\margl1440\margr1440\margt1440\margb1440\gutter0\ltrsect 
+\widowctrl\ftnbj\aenddoc\trackmoves1\trackformatting1\donotembedsysfont1\relyonvml0\donotembedlingdata0\grfdocevents0\validatexml1\showplaceholdtext0\ignoremixedcontent0\saveinvalidxml0\showxmlerrors1\noxlattoyen
+\expshrtn\noultrlspc\dntblnsbdb\nospaceforul\formshade\horzdoc\dgmargin\dghspace180\dgvspace180\dghorigin1440\dgvorigin1440\dghshow1\dgvshow1
+\jexpand\viewkind1\viewscale150\pgbrdrhead\pgbrdrfoot\splytwnine\ftnlytwnine\htmautsp\nolnhtadjtbl\useltbaln\alntblind\lytcalctblwd\lyttblrtgr\lnbrkrule\nobrkwrptbl\snaptogridincell\allowfieldendsel\wrppunct
+\asianbrkrule\rsidroot4535536\newtblstyruls\nogrowautofit\usenormstyforlist\noindnmbrts\felnbrelev\nocxsptable\indrlsweleven\noafcnsttbl\afelev\utinl\hwelev\spltpgpar\notcvasp\notbrkcnstfrctbl\notvatxbx\krnprsnet\cachedcolbal \nouicompat \fet0
+{\*\wgrffmtfilter 2450}\nofeaturethrottle1\ilfomacatclnup0{\*\ftnsep \ltrpar \pard\plain \ltrpar\ql \li0\ri0\widctlpar\wrapdefault\aspalpha\aspnum\faauto\adjustright\rin0\lin0\itap0\pararsid4535536 \rtlch\fcs1 \af31507\afs22\alang1025 \ltrch\fcs0 
+\f31506\fs22\lang1033\langfe1033\cgrid\langnp1033\langfenp1033 {\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid4535536 \chftnsep 
+\par }}{\*\ftnsepc \ltrpar \pard\plain \ltrpar\ql \li0\ri0\widctlpar\wrapdefault\aspalpha\aspnum\faauto\adjustright\rin0\lin0\itap0\pararsid4535536 \rtlch\fcs1 \af31507\afs22\alang1025 \ltrch\fcs0 \f31506\fs22\lang1033\langfe1033\cgrid\langnp1033\langfenp1033 
+{\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid4535536 \chftnsepc 
+\par }}{\*\aftnsep \ltrpar \pard\plain \ltrpar\ql \li0\ri0\widctlpar\wrapdefault\aspalpha\aspnum\faauto\adjustright\rin0\lin0\itap0\pararsid4535536 \rtlch\fcs1 \af31507\afs22\alang1025 \ltrch\fcs0 \f31506\fs22\lang1033\langfe1033\cgrid\langnp1033\langfenp1033 
+{\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid4535536 \chftnsep 
+\par }}{\*\aftnsepc \ltrpar \pard\plain \ltrpar\ql \li0\ri0\widctlpar\wrapdefault\aspalpha\aspnum\faauto\adjustright\rin0\lin0\itap0\pararsid4535536 \rtlch\fcs1 \af31507\afs22\alang1025 \ltrch\fcs0 
+\f31506\fs22\lang1033\langfe1033\cgrid\langnp1033\langfenp1033 {\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid4535536 \chftnsepc 
+\par }}\ltrpar \sectd \ltrsect\linex0\endnhere\sectlinegrid360\sectdefaultcl\sectrsid16456967\sftnbj {\headerr \ltrpar \pard\plain \ltrpar\s15\ql \li0\ri0\widctlpar\tqc\tx4680\tqr\tx9360\wrapdefault\aspalpha\aspnum\faauto\adjustright\rin0\lin0\itap0 
+\rtlch\fcs1 \af31507\afs22\alang1025 \ltrch\fcs0 \f31506\fs22\lang1033\langfe1033\cgrid\langnp1033\langfenp1033 {\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid4535536 This is the header text}{\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid12662658 .}{\rtlch\fcs1 
+\af31507 \ltrch\fcs0 \insrsid4535536 
+\par 
+\par }}{\footerr \ltrpar \pard\plain \ltrpar\s17\ql \li0\ri0\widctlpar\tqc\tx4680\tqr\tx9360\wrapdefault\aspalpha\aspnum\faauto\adjustright\rin0\lin0\itap0 \rtlch\fcs1 \af31507\afs22\alang1025 \ltrch\fcs0 
+\f31506\fs22\lang1033\langfe1033\cgrid\langnp1033\langfenp1033 {\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid4535536 This is the footer text.
+\par 
+\par }}{\*\pnseclvl1\pnucrm\pnstart1\pnindent720\pnhang {\pntxta .}}{\*\pnseclvl2\pnucltr\pnstart1\pnindent720\pnhang {\pntxta .}}{\*\pnseclvl3\pndec\pnstart1\pnindent720\pnhang {\pntxta .}}{\*\pnseclvl4\pnlcltr\pnstart1\pnindent720\pnhang {\pntxta )}}
+{\*\pnseclvl5\pndec\pnstart1\pnindent720\pnhang {\pntxtb (}{\pntxta )}}{\*\pnseclvl6\pnlcltr\pnstart1\pnindent720\pnhang {\pntxtb (}{\pntxta )}}{\*\pnseclvl7\pnlcrm\pnstart1\pnindent720\pnhang {\pntxtb (}{\pntxta )}}{\*\pnseclvl8
+\pnlcltr\pnstart1\pnindent720\pnhang {\pntxtb (}{\pntxta )}}{\*\pnseclvl9\pnlcrm\pnstart1\pnindent720\pnhang {\pntxtb (}{\pntxta )}}\pard\plain \ltrpar\ql \li0\ri0\sa200\sl276\slmult1
+\widctlpar\wrapdefault\aspalpha\aspnum\faauto\adjustright\rin0\lin0\itap0 \rtlch\fcs1 \af31507\afs22\alang1025 \ltrch\fcs0 \f31506\fs22\lang1033\langfe1033\cgrid\langnp1033\langfenp1033 {\rtlch\fcs1 \af31507 \ltrch\fcs0 
+\lang1024\langfe1024\noproof\langfenp1028\insrsid4535536 {\shp{\*\shpinst\shpleft4866\shptop1990\shpright8593\shpbottom2658\shpfhdr0\shpbxcolumn\shpbxignore\shpbypara\shpbyignore\shpwr3\shpwrk0\shpfblwtxt0\shpz0\shplid1026
+{\sp{\sn shapeType}{\sv 202}}{\sp{\sn fFlipH}{\sv 0}}{\sp{\sn fFlipV}{\sv 0}}{\sp{\sn lTxid}{\sv 65536}}{\sp{\sn hspNext}{\sv 1026}}{\sp{\sn fFitShapeToText}{\sv 1}}{\sp{\sn dhgt}{\sv 251660288}}{\sp{\sn pctHoriz}{\sv 400}}{\sp{\sn pctVert}{\sv 200}}
+{\sp{\sn sizerelh}{\sv 0}}{\sp{\sn sizerelv}{\sv 0}}{\sp{\sn fLayoutInCell}{\sv 1}}{\shptxt \ltrpar \pard\plain \ltrpar\ql \li0\ri0\sa200\sl276\slmult1\widctlpar\wrapdefault\aspalpha\aspnum\faauto\adjustright\rin0\lin0\itap0 \rtlch\fcs1 
+\af31507\afs22\alang1025 \ltrch\fcs0 \f31506\fs22\lang1033\langfe1033\cgrid\langnp1033\langfenp1033 {\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid4535536 Here is a text box
+\par }}}{\shprslt{\*\do\dobxcolumn\dobypara\dodhgt8192\dptxbx\dptxlrtb{\dptxbxtext\ltrpar \pard\plain \ltrpar\ql \li0\ri0\sa200\sl276\slmult1\widctlpar\wrapdefault\aspalpha\aspnum\faauto\adjustright\rin0\lin0\itap0 \rtlch\fcs1 \af31507\afs22\alang1025 
+\ltrch\fcs0 \f31506\fs22\lang1033\langfe1033\cgrid\langnp1033\langfenp1033 {\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid4535536 Here is a text box
+\par }}\dpx4866\dpy1990\dpxsize3727\dpysize668\dpfillfgcr255\dpfillfgcg255\dpfillfgcb255\dpfillbgcr255\dpfillbgcg255\dpfillbgcb255\dpfillpat1\dplinew15\dplinecor0\dplinecog0\dplinecob0}}}}{\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid4535536 Footnote appears here}
+{\rtlch\fcs1 \af31507 \ltrch\fcs0 \cs25\super\insrsid10829135 \chftn {\footnote \ltrpar \pard\plain \ltrpar\s23\ql \li0\ri0\widctlpar\wrapdefault\aspalpha\aspnum\faauto\adjustright\rin0\lin0\itap0 \rtlch\fcs1 \af31507\afs20\alang1025 \ltrch\fcs0 
+\f31506\fs20\lang1033\langfe1033\cgrid\langnp1033\langfenp1033 {\rtlch\fcs1 \af31507 \ltrch\fcs0 \cs25\super\insrsid10829135 \chftn }{\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid10829135  This is a footnote.}}}{\rtlch\fcs1 \af31507 \ltrch\fcs0 
+\insrsid14292078 
+\par }{\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid14556934 
+\par }{\rtlch\fcs1 \af31507 \ltrch\fcs0 \b\insrsid14556934\charrsid14556934 Bold}{\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid14556934  }{\rtlch\fcs1 \af31507 \ltrch\fcs0 \i\insrsid14556934\charrsid14556934 italic}{\rtlch\fcs1 \af31507 \ltrch\fcs0 
+\insrsid14556934  }{\rtlch\fcs1 \af31507 \ltrch\fcs0 \ul\insrsid14556934\charrsid14556934 underline}{\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid14556934  }{\rtlch\fcs1 \af31507 \ltrch\fcs0 \super\insrsid14556934\charrsid14556934 superscript}{\rtlch\fcs1 
+\af31507 \ltrch\fcs0 \insrsid14556934  }{\rtlch\fcs1 \af31507 \ltrch\fcs0 \sub\insrsid14556934\charrsid14556934 subscript}{\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid14556934 
+\par }\pard \ltrpar\ql \li0\ri0\sa200\sl276\slmult1\widctlpar\wrapdefault\aspalpha\aspnum\faauto\adjustright\rin0\lin0\itap0\pararsid10055055 {\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid14292078 
+\par }{\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid10055055 Here is a list:
+\par {\listtext\pard\plain\ltrpar \s27 \rtlch\fcs1 \af31507\afs22 \ltrch\fcs0 \f3\fs22\insrsid10055055 \loch\af3\dbch\af31506\hich\f3 \'b7\tab}}\pard\plain \ltrpar\s27\ql \fi-360\li720\ri0\sa200\sl276\slmult1
+\widctlpar\wrapdefault\aspalpha\aspnum\faauto\ls2\adjustright\rin0\lin720\itap0\pararsid10055055\contextualspace \rtlch\fcs1 \af31507\afs22\alang1025 \ltrch\fcs0 \f31506\fs22\lang1033\langfe1033\cgrid\langnp1033\langfenp1033 {\rtlch\fcs1 \af31507 
+\ltrch\fcs0 \insrsid10055055 Bullet 1
+\par {\listtext\pard\plain\ltrpar \s27 \rtlch\fcs1 \af31507\afs22 \ltrch\fcs0 \f3\fs22\insrsid10055055 \loch\af3\dbch\af31506\hich\f3 \'b7\tab}Bullet 2
+\par {\listtext\pard\plain\ltrpar \s27 \rtlch\fcs1 \af31507\afs22 \ltrch\fcs0 \f3\fs22\insrsid10055055 \loch\af3\dbch\af31506\hich\f3 \'b7\tab}Bullet 3
+\par }\pard\plain \ltrpar\ql \li0\ri0\sa200\sl276\slmult1\widctlpar\wrapdefault\aspalpha\aspnum\faauto\adjustright\rin0\lin0\itap0\pararsid10055055 \rtlch\fcs1 \af31507\afs22\alang1025 \ltrch\fcs0 \f31506\fs22\lang1033\langfe1033\cgrid\langnp1033\langfenp1033 
+{\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid10055055 Here is a numbered list:
+\par {\listtext\pard\plain\ltrpar \s27 \rtlch\fcs1 \af31507\afs22 \ltrch\fcs0 \f31506\fs22\insrsid10055055 \hich\af31506\dbch\af31506\loch\f31506 1)\tab}}\pard\plain \ltrpar\s27\ql \fi-360\li720\ri0\sa200\sl276\slmult1
+\widctlpar\wrapdefault\aspalpha\aspnum\faauto\ls3\adjustright\rin0\lin720\itap0\pararsid10055055\contextualspace \rtlch\fcs1 \af31507\afs22\alang1025 \ltrch\fcs0 \f31506\fs22\lang1033\langfe1033\cgrid\langnp1033\langfenp1033 {\rtlch\fcs1 \af31507 
+\ltrch\fcs0 \insrsid10055055 Number bullet 1
+\par {\listtext\pard\plain\ltrpar \s27 \rtlch\fcs1 \af31507\afs22 \ltrch\fcs0 \f31506\fs22\insrsid10055055 \hich\af31506\dbch\af31506\loch\f31506 2)\tab}Number bullet 2
+\par {\listtext\pard\plain\ltrpar \s27 \rtlch\fcs1 \af31507\afs22 \ltrch\fcs0 \f31506\fs22\insrsid10055055 \hich\af31506\dbch\af31506\loch\f31506 3)\tab}Number bullet 3
+\par }\pard\plain \ltrpar\ql \li0\ri0\sa200\sl276\slmult1\widctlpar\wrapdefault\aspalpha\aspnum\faauto\adjustright\rin0\lin0\itap0 \rtlch\fcs1 \af31507\afs22\alang1025 \ltrch\fcs0 \f31506\fs22\lang1033\langfe1033\cgrid\langnp1033\langfenp1033 {\rtlch\fcs1 
+\af31507 \ltrch\fcs0 \insrsid10829135 
+\par }{\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid4535536\charrsid4535536  }{\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid4535536 }{\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid4535536 Keyword1 Keyword2}{\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid4535536 }{\rtlch\fcs1 
+\af31507 \ltrch\fcs0 \insrsid15481255 
+\par }{\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid4535536 
+\par }{\field{\*\fldinst {\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid4535536  HYPERLINK "http://tika.apache.org" }{\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid4535536 {\*\datafield 
+00d0c9ea79f9bace118c8200aa004ba90b0200000003000000e0c9ea79f9bace118c8200aa004ba90b4800000068007400740070003a002f002f00740069006b0061002e006100700061006300680065002e006f00720067002f000000795881f43b1d7f48af2c825dc485276300000000a5ab0000}}}{\fldrslt {
+\rtlch\fcs1 \af31507 \ltrch\fcs0 \cs21\ul\cf17\insrsid4535536\charrsid4535536 This is a hyperlink}}}\sectd \ltrsect\linex0\endnhere\sectlinegrid360\sectdefaultcl\sectrsid16456967\sftnbj {\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid14292078 
+\par 
+\par }{\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid4535536\charrsid4535536  }{\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid14292078 }{\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid14292078 Subject is here}{\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid14292078 }{\rtlch\fcs1 
+\af31507 \ltrch\fcs0 \insrsid4535536 
+\par }{\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid8288896 
+\par \ltrrow}\trowd \irow0\irowband0\ltrrow\ts26\trgaph108\trleft-108\trbrdrt\brdrs\brdrw10 \trbrdrl\brdrs\brdrw10 \trbrdrb\brdrs\brdrw10 \trbrdrr\brdrs\brdrw10 \trbrdrh\brdrs\brdrw10 \trbrdrv\brdrs\brdrw10 
+\trftsWidth1\trftsWidthB3\trautofit1\trpaddl108\trpaddr108\trpaddfl3\trpaddft3\trpaddfb3\trpaddfr3\tblrsid8288896\tbllkhdrrows\tbllkhdrcols\tbllknocolband\tblind0\tblindtype3 \clvertalt\clbrdrt\brdrs\brdrw10 \clbrdrl\brdrs\brdrw10 \clbrdrb\brdrs\brdrw10 
+\clbrdrr\brdrs\brdrw10 \cltxlrtb\clftsWidth3\clwWidth3192\clshdrawnil \cellx3084\clvertalt\clbrdrt\brdrs\brdrw10 \clbrdrl\brdrs\brdrw10 \clbrdrb\brdrs\brdrw10 \clbrdrr\brdrs\brdrw10 \cltxlrtb\clftsWidth3\clwWidth3192\clshdrawnil \cellx6276\clvertalt
+\clbrdrt\brdrs\brdrw10 \clbrdrl\brdrs\brdrw10 \clbrdrb\brdrs\brdrw10 \clbrdrr\brdrs\brdrw10 \cltxlrtb\clftsWidth3\clwWidth3192\clshdrawnil \cellx9468\pard\plain \ltrpar
+\ql \li0\ri0\widctlpar\intbl\wrapdefault\aspalpha\aspnum\faauto\adjustright\rin0\lin0\yts26 \rtlch\fcs1 \af31507\afs22\alang1025 \ltrch\fcs0 \f31506\fs22\lang1033\langfe1033\cgrid\langnp1033\langfenp1033 {\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid8288896 
+Row 1 Col 1\cell Row 1 Col 2\cell Row 1 Col 3\cell }\pard\plain \ltrpar\ql \li0\ri0\sa200\sl276\slmult1\widctlpar\intbl\wrapdefault\aspalpha\aspnum\faauto\adjustright\rin0\lin0 \rtlch\fcs1 \af31507\afs22\alang1025 \ltrch\fcs0 
+\f31506\fs22\lang1033\langfe1033\cgrid\langnp1033\langfenp1033 {\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid8288896 \trowd \irow0\irowband0\ltrrow\ts26\trgaph108\trleft-108\trbrdrt\brdrs\brdrw10 \trbrdrl\brdrs\brdrw10 \trbrdrb\brdrs\brdrw10 \trbrdrr
+\brdrs\brdrw10 \trbrdrh\brdrs\brdrw10 \trbrdrv\brdrs\brdrw10 \trftsWidth1\trftsWidthB3\trautofit1\trpaddl108\trpaddr108\trpaddfl3\trpaddft3\trpaddfb3\trpaddfr3\tblrsid8288896\tbllkhdrrows\tbllkhdrcols\tbllknocolband\tblind0\tblindtype3 \clvertalt\clbrdrt
+\brdrs\brdrw10 \clbrdrl\brdrs\brdrw10 \clbrdrb\brdrs\brdrw10 \clbrdrr\brdrs\brdrw10 \cltxlrtb\clftsWidth3\clwWidth3192\clshdrawnil \cellx3084\clvertalt\clbrdrt\brdrs\brdrw10 \clbrdrl\brdrs\brdrw10 \clbrdrb\brdrs\brdrw10 \clbrdrr\brdrs\brdrw10 
+\cltxlrtb\clftsWidth3\clwWidth3192\clshdrawnil \cellx6276\clvertalt\clbrdrt\brdrs\brdrw10 \clbrdrl\brdrs\brdrw10 \clbrdrb\brdrs\brdrw10 \clbrdrr\brdrs\brdrw10 \cltxlrtb\clftsWidth3\clwWidth3192\clshdrawnil \cellx9468\row \ltrrow}\pard\plain \ltrpar
+\ql \li0\ri0\widctlpar\intbl\wrapdefault\aspalpha\aspnum\faauto\adjustright\rin0\lin0\yts26 \rtlch\fcs1 \af31507\afs22\alang1025 \ltrch\fcs0 \f31506\fs22\lang1033\langfe1033\cgrid\langnp1033\langfenp1033 {\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid8288896 
+Row 2 Col 1\cell Row 2 Col 2\cell Row 2 Col 3\cell }\pard\plain \ltrpar\ql \li0\ri0\sa200\sl276\slmult1\widctlpar\intbl\wrapdefault\aspalpha\aspnum\faauto\adjustright\rin0\lin0 \rtlch\fcs1 \af31507\afs22\alang1025 \ltrch\fcs0 
+\f31506\fs22\lang1033\langfe1033\cgrid\langnp1033\langfenp1033 {\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid8288896 \trowd \irow1\irowband1\lastrow \ltrrow\ts26\trgaph108\trleft-108\trbrdrt\brdrs\brdrw10 \trbrdrl\brdrs\brdrw10 \trbrdrb\brdrs\brdrw10 \trbrdrr
+\brdrs\brdrw10 \trbrdrh\brdrs\brdrw10 \trbrdrv\brdrs\brdrw10 \trftsWidth1\trftsWidthB3\trautofit1\trpaddl108\trpaddr108\trpaddfl3\trpaddft3\trpaddfb3\trpaddfr3\tblrsid8288896\tbllkhdrrows\tbllkhdrcols\tbllknocolband\tblind0\tblindtype3 \clvertalt\clbrdrt
+\brdrs\brdrw10 \clbrdrl\brdrs\brdrw10 \clbrdrb\brdrs\brdrw10 \clbrdrr\brdrs\brdrw10 \cltxlrtb\clftsWidth3\clwWidth3192\clshdrawnil \cellx3084\clvertalt\clbrdrt\brdrs\brdrw10 \clbrdrl\brdrs\brdrw10 \clbrdrb\brdrs\brdrw10 \clbrdrr\brdrs\brdrw10 
+\cltxlrtb\clftsWidth3\clwWidth3192\clshdrawnil \cellx6276\clvertalt\clbrdrt\brdrs\brdrw10 \clbrdrl\brdrs\brdrw10 \clbrdrb\brdrs\brdrw10 \clbrdrr\brdrs\brdrw10 \cltxlrtb\clftsWidth3\clwWidth3192\clshdrawnil \cellx9468\row }\pard \ltrpar
+\ql \li0\ri0\sa200\sl276\slmult1\widctlpar\wrapdefault\aspalpha\aspnum\faauto\adjustright\rin0\lin0\itap0 {\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid8288896 
+\par }{\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid724479 Suddenly some }{\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid5706211 J}{\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid724479 apanese text:}{\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid9969477 
+\par }{\rtlch\fcs1 \af11 \ltrch\fcs0 \loch\af11\hich\af11\dbch\af11\insrsid724479\charrsid724479 \loch\af11\hich\af11\dbch\f11 \uc2\u12478\'83\'5d\u12523\'83\'8b\u12466\'83\'51\u12392\'82\'c6\u23614\'94\'f6\u23822\'8d\'e8\u12289\'81\'41\u28129\'92\'57\u12293
+\'81\'58\u12392\'82\'c6\u26368\'8d\'c5\u26399\'8a\'fa}{\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid9969477 
+\par }{\rtlch\fcs1 \af15 \ltrch\fcs0 \lang1033\langfe1041\loch\af15\hich\af15\dbch\af15\langfenp1041\insrsid5843828 \loch\af15\hich\af15\dbch\f15 \uc2\u-248\'81\'69\u-217\'82\'66\u-216\'82\'67\u-207\'82\'70\u-247\'81\'6a}{\rtlch\fcs1 \af31507 \ltrch\fcs0 
+\insrsid9969477 
+\par }{\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid5706211 And then some Gothic text:
+\par }\pard \ltrpar\ql \li0\ri0\nowidctlpar\wrapdefault\faauto\rin0\lin0\itap0\pararsid14163426 {\rtlch\fcs1 \af1\afs20 \ltrch\fcs0 \f1\fs20\insrsid14163426 \u-10240\'3f\u-8398\'3f\u-10240\'3f\u-8385\'3f\u-10240\'3f\u-8380\'3f\u-10240\'3f\u-8391\'3f\u-10240
+\'3f\u-8381\'3f\u-10240\'3f\u-8390\'3f}{\rtlch\fcs1 \af1\afs20 \ltrch\fcs0 \f1\fs20\insrsid14163426 
+\par }\pard \ltrpar\ql \li0\ri0\sa200\sl276\slmult1\widctlpar\wrapdefault\aspalpha\aspnum\faauto\adjustright\rin0\lin0\itap0 {\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid9969477 
+\par }{\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid7218132 Here is a citation:}{\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid9969477 
+\par }{\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid12941695 }{\field{\*\fldinst {\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid12941695  CITATION Kra \\l 1033 }}{\fldrslt {\rtlch\fcs1 \af31507 \ltrch\fcs0 \lang1024\langfe1024\noproof\insrsid12941695 (Kramer)}}}
+\sectd \ltrsect\linex0\endnhere\sectlinegrid360\sectdefaultcl\sectrsid16456967\sftnbj {\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid12941695 }{\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid9969477 
+\par }{\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid11105546 
+\par }\pard\plain \ltrpar\s28\ql \li0\ri0\sa200\widctlpar\wrapdefault\aspalpha\aspnum\faauto\adjustright\rin0\lin0\itap0\pararsid11105546 \rtlch\fcs1 \ab\af31507\afs18\alang1025 \ltrch\fcs0 \b\f31506\fs18\cf18\lang1033\langfe1033\cgrid\langnp1033\langfenp1033 
+{\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid11105546 Figure }{\field{\*\fldinst {\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid11105546  SEQ Figure \\* ARABIC }}{\fldrslt {\rtlch\fcs1 \af31507 \ltrch\fcs0 \lang1024\langfe1024\noproof\insrsid11105546 1}}}
+\sectd \ltrsect\linex0\endnhere\sectlinegrid360\sectdefaultcl\sectrsid16456967\sftnbj {\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid11105546  This is a caption for Figure 1
+\par }\pard\plain \ltrpar\ql \li0\ri0\sa200\sl276\slmult1\widctlpar\wrapdefault\aspalpha\aspnum\faauto\adjustright\rin0\lin0\itap0\pararsid8152053 \rtlch\fcs1 \af31507\afs22\alang1025 \ltrch\fcs0 \f31506\fs22\lang1033\langfe1033\cgrid\langnp1033\langfenp1033 {
+\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid8152053 
+\par 
+\par }{\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid5051464 \sect }\sectd \ltrsect\sbknone\linex0\cols2\endnhere\sectlinegrid360\sectdefaultcl\sectrsid5051464\sftnbj \pard\plain \ltrpar\ql \li0\ri0\sa200\sl276\slmult1
+\widctlpar\wrapdefault\aspalpha\aspnum\faauto\adjustright\rin0\lin0\itap0\pararsid5051464 \rtlch\fcs1 \af31507\afs22\alang1025 \ltrch\fcs0 \f31506\fs22\lang1033\langfe1033\cgrid\langnp1033\langfenp1033 {\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid5051464 
+Row 1 column 1
+\par Row 2 column 1
+\par }\pard \ltrpar\ql \li0\ri0\sa200\sl276\slmult1\widctlpar\wrapdefault\aspalpha\aspnum\faauto\adjustright\rin0\lin0\itap0\pararsid8152053 {\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid5051464 Row 1 column 2
+\par Row 2 column 2
+\par \sect }\sectd \ltrsect\sbknone\linex0\endnhere\sectlinegrid360\sectdefaultcl\sectrsid5051464\sftnbj \pard\plain \ltrpar\ql \li0\ri0\sa200\sl276\slmult1\widctlpar\wrapdefault\aspalpha\aspnum\faauto\adjustright\rin0\lin0\itap0\pararsid8152053 \rtlch\fcs1 
+\af31507\afs22\alang1025 \ltrch\fcs0 \f31506\fs22\lang1033\langfe1033\cgrid\langnp1033\langfenp1033 {\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid5051464\charrsid8152053 
+\par }{\*\themedata 504b030414000600080000002100828abc13fa0000001c020000130000005b436f6e74656e745f54797065735d2e786d6cac91cb6ac3301045f785fe83d0b6d8
+72ba28a5d8cea249777d2cd20f18e4b12d6a8f843409c9df77ecb850ba082d74231062ce997b55ae8fe3a00e1893f354e9555e6885647de3a8abf4fbee29bbd7
+2a3150038327acf409935ed7d757e5ee14302999a654e99e393c18936c8f23a4dc072479697d1c81e51a3b13c07e4087e6b628ee8cf5c4489cf1c4d075f92a0b
+44d7a07a83c82f308ac7b0a0f0fbf90c2480980b58abc733615aa2d210c2e02cb04430076a7ee833dfb6ce62e3ed7e14693e8317d8cd0433bf5c60f53fea2fe7
+065bd80facb647e9e25c7fc421fd2ddb526b2e9373fed4bb902e182e97b7b461e6bfad3f010000ffff0300504b030414000600080000002100a5d6a7e7c00000
+00360100000b0000005f72656c732f2e72656c73848fcf6ac3300c87ef85bd83d17d51d2c31825762fa590432fa37d00e1287f68221bdb1bebdb4fc7060abb08
+84a4eff7a93dfeae8bf9e194e720169aaa06c3e2433fcb68e1763dbf7f82c985a4a725085b787086a37bdbb55fbc50d1a33ccd311ba548b63095120f88d94fbc
+52ae4264d1c910d24a45db3462247fa791715fd71f989e19e0364cd3f51652d73760ae8fa8c9ffb3c330cc9e4fc17faf2ce545046e37944c69e462a1a82fe353
+bd90a865aad41ed0b5b8f9d6fd010000ffff0300504b0304140006000800000021006b799616830000008a0000001c0000007468656d652f7468656d652f7468
+656d654d616e616765722e786d6c0ccc4d0ac3201040e17da17790d93763bb284562b2cbaebbf600439c1a41c7a0d29fdbd7e5e38337cedf14d59b4b0d592c9c
+070d8a65cd2e88b7f07c2ca71ba8da481cc52c6ce1c715e6e97818c9b48d13df49c873517d23d59085adb5dd20d6b52bd521ef2cdd5eb9246a3d8b4757e8d3f7
+29e245eb2b260a0238fd010000ffff0300504b03041400060008000000210096b5ade296060000501b0000160000007468656d652f7468656d652f7468656d65
+312e786d6cec594f6fdb3614bf0fd87720746f6327761a07758ad8b19b2d4d1bc46e871e698996d850a240d2497d1bdae38001c3ba618715d86d87615b8116d8
+a5fb34d93a6c1dd0afb0475292c5585e9236d88aad3e2412f9e3fbff1e1fa9abd7eec70c1d1221294fda5efd72cd4324f1794093b0eddd1ef62fad79482a9c04
+98f184b4bd2991deb58df7dfbb8ad755446282607d22d771db8b944ad79796a40fc3585ee62949606ecc458c15bc8a702910f808e8c66c69b9565b5d8a314d3c
+94e018c8de1a8fa94fd05093f43672e23d06af89927ac06762a049136785c10607758d9053d965021d62d6f6804fc08f86e4bef210c352c144dbab999fb7b471
+7509af678b985ab0b6b4ae6f7ed9ba6c4170b06c788a705430adf71bad2b5b057d03606a1ed7ebf5babd7a41cf00b0ef83a6569632cd467faddec9699640f671
+9e76b7d6ac355c7c89feca9cccad4ea7d36c65b258a206641f1b73f8b5da6a6373d9c11b90c537e7f08dce66b7bbeae00dc8e257e7f0fd2badd5868b37a088d1
+e4600ead1ddaef67d40bc898b3ed4af81ac0d76a197c86826828a24bb318f3442d8ab518dfe3a20f000d6458d104a9694ac6d88728eee2782428d60cf03ac1a5
+193be4cbb921cd0b495fd054b5bd0f530c1931a3f7eaf9f7af9e3f45c70f9e1d3ff8e9f8e1c3e3073f5a42ceaa6d9c84e5552fbffdeccfc71fa33f9e7ef3f2d1
+17d57859c6fffac327bffcfc793510d26726ce8b2f9ffcf6ecc98baf3efdfdbb4715f04d814765f890c644a29be408edf3181433567125272371be15c308d3f2
+8acd249438c19a4b05fd9e8a1cf4cd296699771c393ac4b5e01d01e5a30a787d72cf1178108989a2159c77a2d801ee72ce3a5c545a6147f32a99793849c26ae6
+6252c6ed637c58c5bb8b13c7bfbd490a75330f4b47f16e441c31f7184e140e494214d273fc80900aedee52ead87597fa824b3e56e82e451d4c2b4d32a423279a
+668bb6690c7e9956e90cfe766cb37b077538abd27a8b1cba48c80acc2a841f12e698f13a9e281c57911ce298950d7e03aba84ac8c154f8655c4f2af074481847
+bd804859b5e696007d4b4edfc150b12addbecba6b18b148a1e54d1bc81392f23b7f84137c2715a851dd0242a633f900710a218ed715505dfe56e86e877f0034e
+16bafb0e258ebb4faf06b769e888340b103d3311da9750aa9d0a1cd3e4efca31a3508f6d0c5c5c398602f8e2ebc71591f5b616e24dd893aa3261fb44f95d843b
+5974bb5c04f4edafb95b7892ec1108f3f98de75dc97d5772bdff7cc95d94cf672db4b3da0a6557f70db629362d72bcb0431e53c6066acac80d699a6409fb44d0
+8741bdce9c0e4971624a2378cceaba830b05366b90e0ea23aaa241845368b0eb9e2612ca8c742851ca251ceccc70256d8d87265dd96361531f186c3d9058edf2
+c00eafe8e1fc5c509031bb4d680e9f39a3154de0accc56ae644441edd76156d7429d995bdd88664a9dc3ad50197c38af1a0c16d684060441db02565e85f3b966
+0d0713cc48a0ed6ef7dedc2dc60b17e92219e180643ed27acffba86e9c94c78ab90980d8a9f0913ee49d62b512b79626fb06dccee2a432bbc60276b9f7dec44b
+7904cfbca4f3f6443ab2a49c9c2c41476dafd55c6e7ac8c769db1bc399161ee314bc2e75cf8759081743be1236ec4f4d6693e5336fb672c5dc24a8c33585b5fb
+9cc24e1d4885545b58463634cc5416022cd19cacfccb4d30eb45296023fd35a458598360f8d7a4003bbaae25e331f155d9d9a5116d3bfb9a95523e51440ca2e0
+088dd844ec6370bf0e55d027a012ae264c45d02f708fa6ad6da6dce29c255df9f6cae0ec38666984b372ab5334cf640b37795cc860de4ae2816e95b21be5ceaf
+8a49f90b52a51cc6ff3355f47e0237052b81f6800fd7b802239daf6d8f0b1571a8426944fdbe80c6c1d40e8816b88b8569082ab84c36ff0539d4ff6dce591a26
+ade1c0a7f669880485fd484582903d284b26fa4e2156cff62e4b9265844c4495c495a9157b440e091bea1ab8aaf7760f4510eaa69a6465c0e04ec69ffb9e65d0
+28d44d4e39df9c1a52ecbd3607fee9cec7263328e5d661d3d0e4f62f44acd855ed7ab33cdf7bcb8ae889599bd5c8b3029895b6825696f6af29c239b75a5bb1e6
+345e6ee6c28117e73586c1a2214ae1be07e93fb0ff51e133fb65426fa843be0fb515c187064d0cc206a2fa926d3c902e907670048d931db4c1a44959d366ad93
+b65abe595f70a75bf03d616c2dd959fc7d4e6317cd99cbcec9c58b34766661c7d6766ca1a9c1b327531486c6f941c638c67cd22a7f75e2a37be0e82db8df9f30
+254d30c1372581a1f51c983c80e4b71ccdd28dbf000000ffff0300504b0304140006000800000021000dd1909fb60000001b010000270000007468656d652f74
+68656d652f5f72656c732f7468656d654d616e616765722e786d6c2e72656c73848f4d0ac2301484f78277086f6fd3ba109126dd88d0add40384e4350d363f24
+51eced0dae2c082e8761be9969bb979dc9136332de3168aa1a083ae995719ac16db8ec8e4052164e89d93b64b060828e6f37ed1567914b284d262452282e3198
+720e274a939cd08a54f980ae38a38f56e422a3a641c8bbd048f7757da0f19b017cc524bd62107bd5001996509affb3fd381a89672f1f165dfe514173d9850528
+a2c6cce0239baa4c04ca5bbabac4df000000ffff0300504b01022d0014000600080000002100828abc13fa0000001c0200001300000000000000000000000000
+000000005b436f6e74656e745f54797065735d2e786d6c504b01022d0014000600080000002100a5d6a7e7c0000000360100000b000000000000000000000000
+002b0100005f72656c732f2e72656c73504b01022d00140006000800000021006b799616830000008a0000001c00000000000000000000000000140200007468
+656d652f7468656d652f7468656d654d616e616765722e786d6c504b01022d001400060008000000210096b5ade296060000501b000016000000000000000000
+00000000d10200007468656d652f7468656d652f7468656d65312e786d6c504b01022d00140006000800000021000dd1909fb60000001b010000270000000000
+00000000000000009b0900007468656d652f7468656d652f5f72656c732f7468656d654d616e616765722e786d6c2e72656c73504b050600000000050005005d010000960a00000000}
+{\*\colorschememapping 3c3f786d6c2076657273696f6e3d22312e302220656e636f64696e673d225554462d3822207374616e64616c6f6e653d22796573223f3e0d0a3c613a636c724d
+617020786d6c6e733a613d22687474703a2f2f736368656d61732e6f70656e786d6c666f726d6174732e6f72672f64726177696e676d6c2f323030362f6d6169
+6e22206267313d226c743122207478313d22646b3122206267323d226c743222207478323d22646b322220616363656e74313d22616363656e74312220616363
+656e74323d22616363656e74322220616363656e74333d22616363656e74332220616363656e74343d22616363656e74342220616363656e74353d22616363656e74352220616363656e74363d22616363656e74362220686c696e6b3d22686c696e6b2220666f6c486c696e6b3d22666f6c486c696e6b222f3e}
+{\*\latentstyles\lsdstimax267\lsdlockeddef0\lsdsemihiddendef1\lsdunhideuseddef1\lsdqformatdef0\lsdprioritydef99{\lsdlockedexcept \lsdsemihidden0 \lsdunhideused0 \lsdqformat1 \lsdpriority0 \lsdlocked0 Normal;
+\lsdsemihidden0 \lsdunhideused0 \lsdqformat1 \lsdpriority9 \lsdlocked0 heading 1;\lsdqformat1 \lsdpriority9 \lsdlocked0 heading 2;\lsdqformat1 \lsdpriority9 \lsdlocked0 heading 3;\lsdqformat1 \lsdpriority9 \lsdlocked0 heading 4;
+\lsdqformat1 \lsdpriority9 \lsdlocked0 heading 5;\lsdqformat1 \lsdpriority9 \lsdlocked0 heading 6;\lsdqformat1 \lsdpriority9 \lsdlocked0 heading 7;\lsdqformat1 \lsdpriority9 \lsdlocked0 heading 8;\lsdqformat1 \lsdpriority9 \lsdlocked0 heading 9;
+\lsdpriority39 \lsdlocked0 toc 1;\lsdpriority39 \lsdlocked0 toc 2;\lsdpriority39 \lsdlocked0 toc 3;\lsdpriority39 \lsdlocked0 toc 4;\lsdpriority39 \lsdlocked0 toc 5;\lsdpriority39 \lsdlocked0 toc 6;\lsdpriority39 \lsdlocked0 toc 7;
+\lsdpriority39 \lsdlocked0 toc 8;\lsdpriority39 \lsdlocked0 toc 9;\lsdqformat1 \lsdpriority35 \lsdlocked0 caption;\lsdsemihidden0 \lsdunhideused0 \lsdqformat1 \lsdpriority10 \lsdlocked0 Title;\lsdpriority1 \lsdlocked0 Default Paragraph Font;
+\lsdsemihidden0 \lsdunhideused0 \lsdqformat1 \lsdpriority11 \lsdlocked0 Subtitle;\lsdsemihidden0 \lsdunhideused0 \lsdqformat1 \lsdpriority22 \lsdlocked0 Strong;\lsdsemihidden0 \lsdunhideused0 \lsdqformat1 \lsdpriority20 \lsdlocked0 Emphasis;
+\lsdsemihidden0 \lsdunhideused0 \lsdpriority59 \lsdlocked0 Table Grid;\lsdunhideused0 \lsdlocked0 Placeholder Text;\lsdsemihidden0 \lsdunhideused0 \lsdqformat1 \lsdpriority1 \lsdlocked0 No Spacing;
+\lsdsemihidden0 \lsdunhideused0 \lsdpriority60 \lsdlocked0 Light Shading;\lsdsemihidden0 \lsdunhideused0 \lsdpriority61 \lsdlocked0 Light List;\lsdsemihidden0 \lsdunhideused0 \lsdpriority62 \lsdlocked0 Light Grid;
+\lsdsemihidden0 \lsdunhideused0 \lsdpriority63 \lsdlocked0 Medium Shading 1;\lsdsemihidden0 \lsdunhideused0 \lsdpriority64 \lsdlocked0 Medium Shading 2;\lsdsemihidden0 \lsdunhideused0 \lsdpriority65 \lsdlocked0 Medium List 1;
+\lsdsemihidden0 \lsdunhideused0 \lsdpriority66 \lsdlocked0 Medium List 2;\lsdsemihidden0 \lsdunhideused0 \lsdpriority67 \lsdlocked0 Medium Grid 1;\lsdsemihidden0 \lsdunhideused0 \lsdpriority68 \lsdlocked0 Medium Grid 2;
+\lsdsemihidden0 \lsdunhideused0 \lsdpriority69 \lsdlocked0 Medium Grid 3;\lsdsemihidden0 \lsdunhideused0 \lsdpriority70 \lsdlocked0 Dark List;\lsdsemihidden0 \lsdunhideused0 \lsdpriority71 \lsdlocked0 Colorful Shading;
+\lsdsemihidden0 \lsdunhideused0 \lsdpriority72 \lsdlocked0 Colorful List;\lsdsemihidden0 \lsdunhideused0 \lsdpriority73 \lsdlocked0 Colorful Grid;\lsdsemihidden0 \lsdunhideused0 \lsdpriority60 \lsdlocked0 Light Shading Accent 1;
+\lsdsemihidden0 \lsdunhideused0 \lsdpriority61 \lsdlocked0 Light List Accent 1;\lsdsemihidden0 \lsdunhideused0 \lsdpriority62 \lsdlocked0 Light Grid Accent 1;\lsdsemihidden0 \lsdunhideused0 \lsdpriority63 \lsdlocked0 Medium Shading 1 Accent 1;
+\lsdsemihidden0 \lsdunhideused0 \lsdpriority64 \lsdlocked0 Medium Shading 2 Accent 1;\lsdsemihidden0 \lsdunhideused0 \lsdpriority65 \lsdlocked0 Medium List 1 Accent 1;\lsdunhideused0 \lsdlocked0 Revision;
+\lsdsemihidden0 \lsdunhideused0 \lsdqformat1 \lsdpriority34 \lsdlocked0 List Paragraph;\lsdsemihidden0 \lsdunhideused0 \lsdqformat1 \lsdpriority29 \lsdlocked0 Quote;\lsdsemihidden0 \lsdunhideused0 \lsdqformat1 \lsdpriority30 \lsdlocked0 Intense Quote;
+\lsdsemihidden0 \lsdunhideused0 \lsdpriority66 \lsdlocked0 Medium List 2 Accent 1;\lsdsemihidden0 \lsdunhideused0 \lsdpriority67 \lsdlocked0 Medium Grid 1 Accent 1;\lsdsemihidden0 \lsdunhideused0 \lsdpriority68 \lsdlocked0 Medium Grid 2 Accent 1;
+\lsdsemihidden0 \lsdunhideused0 \lsdpriority69 \lsdlocked0 Medium Grid 3 Accent 1;\lsdsemihidden0 \lsdunhideused0 \lsdpriority70 \lsdlocked0 Dark List Accent 1;\lsdsemihidden0 \lsdunhideused0 \lsdpriority71 \lsdlocked0 Colorful Shading Accent 1;
+\lsdsemihidden0 \lsdunhideused0 \lsdpriority72 \lsdlocked0 Colorful List Accent 1;\lsdsemihidden0 \lsdunhideused0 \lsdpriority73 \lsdlocked0 Colorful Grid Accent 1;\lsdsemihidden0 \lsdunhideused0 \lsdpriority60 \lsdlocked0 Light Shading Accent 2;
+\lsdsemihidden0 \lsdunhideused0 \lsdpriority61 \lsdlocked0 Light List Accent 2;\lsdsemihidden0 \lsdunhideused0 \lsdpriority62 \lsdlocked0 Light Grid Accent 2;\lsdsemihidden0 \lsdunhideused0 \lsdpriority63 \lsdlocked0 Medium Shading 1 Accent 2;
+\lsdsemihidden0 \lsdunhideused0 \lsdpriority64 \lsdlocked0 Medium Shading 2 Accent 2;\lsdsemihidden0 \lsdunhideused0 \lsdpriority65 \lsdlocked0 Medium List 1 Accent 2;\lsdsemihidden0 \lsdunhideused0 \lsdpriority66 \lsdlocked0 Medium List 2 Accent 2;
+\lsdsemihidden0 \lsdunhideused0 \lsdpriority67 \lsdlocked0 Medium Grid 1 Accent 2;\lsdsemihidden0 \lsdunhideused0 \lsdpriority68 \lsdlocked0 Medium Grid 2 Accent 2;\lsdsemihidden0 \lsdunhideused0 \lsdpriority69 \lsdlocked0 Medium Grid 3 Accent 2;
+\lsdsemihidden0 \lsdunhideused0 \lsdpriority70 \lsdlocked0 Dark List Accent 2;\lsdsemihidden0 \lsdunhideused0 \lsdpriority71 \lsdlocked0 Colorful Shading Accent 2;\lsdsemihidden0 \lsdunhideused0 \lsdpriority72 \lsdlocked0 Colorful List Accent 2;
+\lsdsemihidden0 \lsdunhideused0 \lsdpriority73 \lsdlocked0 Colorful Grid Accent 2;\lsdsemihidden0 \lsdunhideused0 \lsdpriority60 \lsdlocked0 Light Shading Accent 3;\lsdsemihidden0 \lsdunhideused0 \lsdpriority61 \lsdlocked0 Light List Accent 3;
+\lsdsemihidden0 \lsdunhideused0 \lsdpriority62 \lsdlocked0 Light Grid Accent 3;\lsdsemihidden0 \lsdunhideused0 \lsdpriority63 \lsdlocked0 Medium Shading 1 Accent 3;\lsdsemihidden0 \lsdunhideused0 \lsdpriority64 \lsdlocked0 Medium Shading 2 Accent 3;
+\lsdsemihidden0 \lsdunhideused0 \lsdpriority65 \lsdlocked0 Medium List 1 Accent 3;\lsdsemihidden0 \lsdunhideused0 \lsdpriority66 \lsdlocked0 Medium List 2 Accent 3;\lsdsemihidden0 \lsdunhideused0 \lsdpriority67 \lsdlocked0 Medium Grid 1 Accent 3;
+\lsdsemihidden0 \lsdunhideused0 \lsdpriority68 \lsdlocked0 Medium Grid 2 Accent 3;\lsdsemihidden0 \lsdunhideused0 \lsdpriority69 \lsdlocked0 Medium Grid 3 Accent 3;\lsdsemihidden0 \lsdunhideused0 \lsdpriority70 \lsdlocked0 Dark List Accent 3;
+\lsdsemihidden0 \lsdunhideused0 \lsdpriority71 \lsdlocked0 Colorful Shading Accent 3;\lsdsemihidden0 \lsdunhideused0 \lsdpriority72 \lsdlocked0 Colorful List Accent 3;\lsdsemihidden0 \lsdunhideused0 \lsdpriority73 \lsdlocked0 Colorful Grid Accent 3;
+\lsdsemihidden0 \lsdunhideused0 \lsdpriority60 \lsdlocked0 Light Shading Accent 4;\lsdsemihidden0 \lsdunhideused0 \lsdpriority61 \lsdlocked0 Light List Accent 4;\lsdsemihidden0 \lsdunhideused0 \lsdpriority62 \lsdlocked0 Light Grid Accent 4;
+\lsdsemihidden0 \lsdunhideused0 \lsdpriority63 \lsdlocked0 Medium Shading 1 Accent 4;\lsdsemihidden0 \lsdunhideused0 \lsdpriority64 \lsdlocked0 Medium Shading 2 Accent 4;\lsdsemihidden0 \lsdunhideused0 \lsdpriority65 \lsdlocked0 Medium List 1 Accent 4;
+\lsdsemihidden0 \lsdunhideused0 \lsdpriority66 \lsdlocked0 Medium List 2 Accent 4;\lsdsemihidden0 \lsdunhideused0 \lsdpriority67 \lsdlocked0 Medium Grid 1 Accent 4;\lsdsemihidden0 \lsdunhideused0 \lsdpriority68 \lsdlocked0 Medium Grid 2 Accent 4;
+\lsdsemihidden0 \lsdunhideused0 \lsdpriority69 \lsdlocked0 Medium Grid 3 Accent 4;\lsdsemihidden0 \lsdunhideused0 \lsdpriority70 \lsdlocked0 Dark List Accent 4;\lsdsemihidden0 \lsdunhideused0 \lsdpriority71 \lsdlocked0 Colorful Shading Accent 4;
+\lsdsemihidden0 \lsdunhideused0 \lsdpriority72 \lsdlocked0 Colorful List Accent 4;\lsdsemihidden0 \lsdunhideused0 \lsdpriority73 \lsdlocked0 Colorful Grid Accent 4;\lsdsemihidden0 \lsdunhideused0 \lsdpriority60 \lsdlocked0 Light Shading Accent 5;
+\lsdsemihidden0 \lsdunhideused0 \lsdpriority61 \lsdlocked0 Light List Accent 5;\lsdsemihidden0 \lsdunhideused0 \lsdpriority62 \lsdlocked0 Light Grid Accent 5;\lsdsemihidden0 \lsdunhideused0 \lsdpriority63 \lsdlocked0 Medium Shading 1 Accent 5;
+\lsdsemihidden0 \lsdunhideused0 \lsdpriority64 \lsdlocked0 Medium Shading 2 Accent 5;\lsdsemihidden0 \lsdunhideused0 \lsdpriority65 \lsdlocked0 Medium List 1 Accent 5;\lsdsemihidden0 \lsdunhideused0 \lsdpriority66 \lsdlocked0 Medium List 2 Accent 5;
+\lsdsemihidden0 \lsdunhideused0 \lsdpriority67 \lsdlocked0 Medium Grid 1 Accent 5;\lsdsemihidden0 \lsdunhideused0 \lsdpriority68 \lsdlocked0 Medium Grid 2 Accent 5;\lsdsemihidden0 \lsdunhideused0 \lsdpriority69 \lsdlocked0 Medium Grid 3 Accent 5;
+\lsdsemihidden0 \lsdunhideused0 \lsdpriority70 \lsdlocked0 Dark List Accent 5;\lsdsemihidden0 \lsdunhideused0 \lsdpriority71 \lsdlocked0 Colorful Shading Accent 5;\lsdsemihidden0 \lsdunhideused0 \lsdpriority72 \lsdlocked0 Colorful List Accent 5;
+\lsdsemihidden0 \lsdunhideused0 \lsdpriority73 \lsdlocked0 Colorful Grid Accent 5;\lsdsemihidden0 \lsdunhideused0 \lsdpriority60 \lsdlocked0 Light Shading Accent 6;\lsdsemihidden0 \lsdunhideused0 \lsdpriority61 \lsdlocked0 Light List Accent 6;
+\lsdsemihidden0 \lsdunhideused0 \lsdpriority62 \lsdlocked0 Light Grid Accent 6;\lsdsemihidden0 \lsdunhideused0 \lsdpriority63 \lsdlocked0 Medium Shading 1 Accent 6;\lsdsemihidden0 \lsdunhideused0 \lsdpriority64 \lsdlocked0 Medium Shading 2 Accent 6;
+\lsdsemihidden0 \lsdunhideused0 \lsdpriority65 \lsdlocked0 Medium List 1 Accent 6;\lsdsemihidden0 \lsdunhideused0 \lsdpriority66 \lsdlocked0 Medium List 2 Accent 6;\lsdsemihidden0 \lsdunhideused0 \lsdpriority67 \lsdlocked0 Medium Grid 1 Accent 6;
+\lsdsemihidden0 \lsdunhideused0 \lsdpriority68 \lsdlocked0 Medium Grid 2 Accent 6;\lsdsemihidden0 \lsdunhideused0 \lsdpriority69 \lsdlocked0 Medium Grid 3 Accent 6;\lsdsemihidden0 \lsdunhideused0 \lsdpriority70 \lsdlocked0 Dark List Accent 6;
+\lsdsemihidden0 \lsdunhideused0 \lsdpriority71 \lsdlocked0 Colorful Shading Accent 6;\lsdsemihidden0 \lsdunhideused0 \lsdpriority72 \lsdlocked0 Colorful List Accent 6;\lsdsemihidden0 \lsdunhideused0 \lsdpriority73 \lsdlocked0 Colorful Grid Accent 6;
+\lsdsemihidden0 \lsdunhideused0 \lsdqformat1 \lsdpriority19 \lsdlocked0 Subtle Emphasis;\lsdsemihidden0 \lsdunhideused0 \lsdqformat1 \lsdpriority21 \lsdlocked0 Intense Emphasis;
+\lsdsemihidden0 \lsdunhideused0 \lsdqformat1 \lsdpriority31 \lsdlocked0 Subtle Reference;\lsdsemihidden0 \lsdunhideused0 \lsdqformat1 \lsdpriority32 \lsdlocked0 Intense Reference;
+\lsdsemihidden0 \lsdunhideused0 \lsdqformat1 \lsdpriority33 \lsdlocked0 Book Title;\lsdpriority37 \lsdlocked0 Bibliography;\lsdqformat1 \lsdpriority39 \lsdlocked0 TOC Heading;}}{\*\datastore 010500000200000018000000
+4d73786d6c322e534158584d4c5265616465722e352e30000000000000000000000e0000
+d0cf11e0a1b11ae1000000000000000000000000000000003e000300feff0900060000000000000000000000010000000100000000000000001000000200000001000000feffffff0000000000000000ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
+ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
+ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
+ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
+fffffffffffffffffdffffff05000000feffffff04000000fefffffffeffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
+ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
+ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
+ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
+ffffffffffffffffffffffffffffffff52006f006f007400200045006e00740072007900000000000000000000000000000000000000000000000000000000000000000000000000000000000000000016000500ffffffffffffffff01000000ec69d9888b8b3d4c859eaf6cd158be0f0000000000000000000000000076
+bb6efd66cc0103000000c0030000000000004d0073006f004400610074006100530074006f0072006500000000000000000000000000000000000000000000000000000000000000000000000000000000001a000101ffffffffffffffff0200000000000000000000000000000000000000000000000076bb6efd66cc01
+0076bb6efd66cc010000000000000000000000003500cb004c0053004a004300ca00d80044005500470056003000cd0045004500d100c3004c00c000cd0051003d003d000000000000000000000000000000000032000101ffffffffffffffff0300000000000000000000000000000000000000000000000076bb6efd66
+cc010076bb6efd66cc010000000000000000000000004900740065006d0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000a000201ffffffff04000000ffffffff000000000000000000000000000000000000000000000000
+0000000000000000000000000000000016020000000000000100000002000000030000000400000005000000060000000700000008000000feffffff0a0000000b0000000c0000000d0000000e000000feffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
+ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
+ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
+ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
+ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff3c623a536f757263657320786d6c6e733a623d22687474703a2f2f736368656d61732e6f70656e786d6c666f726d6174732e6f72672f6f6666696365446f63756d656e742f323030362f6269626c696f6772617068792220786d6c6e733d
+22687474703a2f2f736368656d61732e6f70656e786d6c666f726d6174732e6f72672f6f6666696365446f63756d656e742f323030362f6269626c696f677261706879222053656c65637465645374796c653d225c4150412e58534c22205374796c654e616d653d22415041223e3c623a536f757263653e3c623a546167
+3e4b72613c2f623a5461673e3c623a536f75726365547970653e426f6f6b3c2f623a536f75726365547970653e3c623a477569643e7b32313839323034362d453338412d344136382d383931312d3837313145343731453345347d3c2f623a477569643e3c623a4c4349443e303c2f623a4c4349443e3c623a417574686f
+723e3c623a417574686f723e3c623a4e616d654c6973743e3c623a506572736f6e3e3c623a4c6173743e4b72616d65723c2f623a4c6173743e3c2f623a506572736f6e3e3c2f623a4e616d654c6973743e3c2f623a417574686f723e3c2f623a417574686f723e3c623a5469746c653e486f7720746f207573652054696b
+613c2f623a5469746c653e3c623a5265664f726465723e313c2f623a5265664f726465723e3c2f623a536f757263653e3c2f623a536f75726365733e0d0a68aa1a083ae995719ac16db8ec8e4052164e89d93b64b060828e6f37ed1567914b284d262452282e31983c3f786d6c2076657273696f6e3d22312e302220656e
+636f64696e673d225554462d3822207374616e64616c6f6e653d226e6f223f3e0d0a3c64733a6461746173746f72654974656d2064733a6974656d49443d227b32344432423237452d423832412d343130442d393536412d4431303443363332453042357d2220786d6c6e733a64733d22687474703a2f2f736368656d61
+732e6f70656e786d6c666f726d6174732e6f72672f6f6666696365446f63756d656e742f323030362f637573746f6d586d6c223e3c64733a736368656d61526566733e3c64733a736368656d615265662064733a7572693d22687474703a2f2f736368656d61732e6f70656e786d6c666f726d6174732e6f72672f6f6666
+696365446f63756d656e742f323030362f6269626c696f677261706879222f3e3c2f64733a736368656d61526566733e3c2f64733a6461746173746f72654974656d3e68656d65312e786d6c504b01022d00140006000800000021000dd1909fb60000001b01000027000000000000000000000000009b0900007468656d
+652f7468656d652f5f72656c732f7468656d654d616e616765722e786d6c2e72656c73504b050600000000050005005d500072006f007000650072007400690065007300000000000000000000000000000000000000000000000000000000000000000000000000000000000000000016000200ffffffffffffffffffff
+ffff0000000000000000000000000000000000000000000000000000000000000000000000000900000055010000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000ffffffffffffffff
+ffffffff0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000ffffffffffff
+ffffffffffff0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000ffffffff
+ffffffffffffffff0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000105000000000000}}
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testSVG.svg b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testSVG.svg
new file mode 100644
index 0000000..f78a87d
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testSVG.svg
@@ -0,0 +1,7 @@
+<?xml version="1.0"?>
+<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" 
+          "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
+<svg width="1cm" height="1cm" version="1.1" xmlns="http://www.w3.org/2000/svg">
+  <desc>Test SVG image</desc>
+  <rect x="0.1cm" y="0.1cm" width="0.8cm" height="0.8cm"/>
+</svg>
\ No newline at end of file
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testTIFF.tif b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testTIFF.tif
new file mode 100644
index 0000000..8f6c7ab
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testTIFF.tif differ
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testTrueType.ttf b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testTrueType.ttf
new file mode 100644
index 0000000..0e1487b
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testTrueType.ttf differ
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testVISIO.vsd b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testVISIO.vsd
new file mode 100644
index 0000000..d699e11
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testVISIO.vsd differ
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testVORBIS.ogg b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testVORBIS.ogg
new file mode 100644
index 0000000..1a02d22
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testVORBIS.ogg differ
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testWAR.war b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testWAR.war
new file mode 100644
index 0000000..3cdcf5b
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testWAR.war differ
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testWAV.wav b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testWAV.wav
new file mode 100644
index 0000000..59a063e
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testWAV.wav differ
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testWINMAIL.dat b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testWINMAIL.dat
new file mode 100644
index 0000000..4cdaa36
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testWINMAIL.dat differ
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testWMA.wma b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testWMA.wma
new file mode 100644
index 0000000..ec2e9bd
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testWMA.wma differ
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testWMF.wmf b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testWMF.wmf
new file mode 100644
index 0000000..f281df5
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testWMF.wmf differ
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testWMV.wmv b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testWMV.wmv
new file mode 100644
index 0000000..d5e67e6
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testWMV.wmv differ
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testWORD_various.doc b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testWORD_various.doc
new file mode 100644
index 0000000..a2ad236
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testWORD_various.doc differ
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testWindows-x86-32.exe b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testWindows-x86-32.exe
new file mode 100644
index 0000000..b1a38b1
Binary files /dev/null and b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testWindows-x86-32.exe differ
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testXML.xml b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testXML.xml
new file mode 100644
index 0000000..a01a402
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-documents/testXML.xml
@@ -0,0 +1,48 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements.  See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License.  You may obtain a copy of the License at
+
+     http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+<oaidc:dc xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:oaidc="http://www.openarchives.org/OAI/2.0/oai_dc/">
+
+	<dc:title>Tika test document</dc:title>
+
+	<dc:creator>Rida Benjelloun</dc:creator>
+
+	<dc:subject>Java</dc:subject>
+
+	<dc:subject>XML</dc:subject>
+
+	<dc:subject>XSLT</dc:subject>
+
+	<dc:subject>JDOM</dc:subject>
+ 
+	<dc:subject>Indexation</dc:subject>
+
+	<dc:description>Framework d'indexation des documents XML, HTML, PDF etc.. </dc:description>
+
+	<dc:identifier>http://www.apache.org</dc:identifier>
+
+	<dc:date>2000-12-01T00:00:00.000Z</dc:date>
+
+	<dc:type>test</dc:type>
+
+	<dc:format>application/msword</dc:format>
+
+	<dc:language>Fr</dc:language>
+
+	<dc:rights>Archimède et Lius à Châteauneuf testing chars en été</dc:rights>	
+
+</oaidc:dc>
\ No newline at end of file
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-morphlines/grokIfNotMatchDropRecord.conf b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-morphlines/grokIfNotMatchDropRecord.conf
new file mode 100644
index 0000000..bdd56ee
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-morphlines/grokIfNotMatchDropRecord.conf
@@ -0,0 +1,75 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#  http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+morphlines : [
+  {
+    id : morphline1
+    importCommands : ["org.kitesdk.**"]
+
+    commands : [
+      {
+        if {
+          conditions : [
+            {
+              not {
+                grok {
+                  dictionaryString : """
+                    POSINT \b(?:[1-9][0-9]*)\b
+SYSLOGTIMESTAMP %{MONTH} +%{MONTHDAY} %{TIME}
+# Months: January, Feb, 3, 03, 12, December
+MONTH \b(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|Jun(?:e)?|Jul(?:y)?|Aug(?:ust)?|Sep(?:tember)?|Oct(?:ober)?|Nov(?:ember)?|Dec(?:ember)?)\b
+MONTHNUM (?:0?[1-9]|1[0-2])
+MONTHDAY (?:(?:0[1-9])|(?:[12][0-9])|(?:3[01])|[1-9])
+HOUR (?:2[0123]|[01]?[0-9])
+MINUTE (?:[0-5][0-9])
+# '60' is a leap second in most time standards and thus is valid.
+SECOND (?:(?:[0-5][0-9]|60)(?:[:.,][0-9]+)?)
+TIME (?<![0-9])%{HOUR}:%{MINUTE}(?::%{SECOND})(?![0-9])
+IP (?<![0-9])(?:(?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2}))(?![0-9])
+HOSTNAME \b(?:[0-9A-Za-z][0-9A-Za-z-]{0,62})(?:\.(?:[0-9A-Za-z][0-9A-Za-z-]{0,62}))*(\.?|\b)
+HOST %{HOSTNAME}
+IPORHOST (?:%{HOSTNAME}|%{IP})
+SYSLOGHOST %{IPORHOST}
+DATA .*?
+GREEDYDATA .*
+                  """
+
+                  expressions : {
+                    message : """<%{POSINT:syslog_pri}>%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}"""
+                  }
+                  extract : inplace
+                  numRequiredMatches : all # default is atLeastOnce
+                  findSubstrings : false
+                  addEmptyStrings : false
+                }
+              }
+            }
+          ]
+          then : [
+            { logDebug { format : "found no grok match; dropping record: {}", args : ["@{}"] } }
+            { dropRecord {} }
+          ]
+          else : [
+            { logDebug { format : "found grok match; retaining record: {}", args : ["@{}"] } }
+          ]
+        }
+      }
+
+      { logDebug { format : "output record: {}", args : ["@{}"] } }
+    ]
+  }
+]
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-morphlines/ifDetectMimeType.conf b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-morphlines/ifDetectMimeType.conf
new file mode 100644
index 0000000..cfe0893
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-morphlines/ifDetectMimeType.conf
@@ -0,0 +1,74 @@
+# Copyright 2013 Cloudera Inc.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# this variable can be overridden in flume.conf or via the MapReduceIndexerTool CLI
+MY.MIME_TYPE : myDefault 
+
+# This morphline routes the record to the south pole if its detected MIME type
+# matches MY.MIME_TYPE (e.g. an avro file); otherwise it routes the record to the north pole.
+morphlines : [
+  {
+    id : morphline1
+    importCommands : ["org.kitesdk.**", "org.apache.solr.**"]
+
+    commands : [
+      {
+        # auto-detect MIME type if it isn't explicitly supplied
+        detectMimeType {
+          includeDefaultMimeTypes : true
+          mimeTypesFiles : [target/test-classes/custom-mimetypes.xml]
+#          mimeTypesString :
+#            """
+#              <mime-info>
+#                <mime-type type="text/space-separated-values">
+#                  <glob pattern="*.ssv"/>
+#                </mime-type>
+#
+#                <mime-type type="avro/binary">
+#                  <magic priority="50">
+#                    <match value="0x4f626a01" type="string" offset="0"/>
+#                  </magic>
+#                  <glob pattern="*.avro"/>
+#                </mime-type>
+#
+#                <mime-type type="mytwittertest/json+delimited+length">
+#                  <magic priority="50">
+#                    <match value="[0-9]+(\r)?\n\\{&quot;" type="regex" offset="0:16"/>
+#                  </magic>
+#                </mime-type>
+#              </mime-info>
+#            """
+        }
+      }
+
+      {
+        if {
+          conditions : [
+            { contains { _attachment_mimetype : [${MY.MIME_TYPE}] } }
+          ]
+          then : [
+            { logDebug { format : "found MIME type match: {}", args : ["@{}"] } }
+            { setValues { "flume.selector.header" : goToSouthPole } }
+          ]
+          else : [
+            { logDebug { format : "found no MIME type match: {}", args : ["@{}"] } }
+            { setValues { "flume.selector.header" : goToNorthPole } }
+          ]
+        }
+      }
+
+      { logDebug { format : "output record: {}", args : ["@{}"] } }
+    ]
+  }
+]
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-morphlines/noOperation.conf b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-morphlines/noOperation.conf
new file mode 100644
index 0000000..f5b493c
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-morphlines/noOperation.conf
@@ -0,0 +1,27 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#  http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+morphlines : [
+  {
+    id : morphline1
+    importCommands : ["org.kitesdk.**"]
+
+    commands : [
+      { logDebug { format : "output record: {}", args : ["@{}"] } }
+    ]
+  }
+]
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-morphlines/readClob.conf b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-morphlines/readClob.conf
new file mode 100644
index 0000000..ac9df9b
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-morphlines/readClob.conf
@@ -0,0 +1,32 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#  http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+morphlines : [
+  {
+    id : morphline1
+    importCommands : ["org.kitesdk.**", "org.apache.solr.**"]
+
+    commands : [
+      {
+        readClob {
+          charset : UTF-8
+        }
+      }
+      { logDebug { format : "output record: {}", args : ["@{}"] } }
+    ]
+  }
+]
diff --git a/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-morphlines/solrCellDocumentTypes.conf b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-morphlines/solrCellDocumentTypes.conf
new file mode 100644
index 0000000..88e6345
--- /dev/null
+++ b/code/flume-ng-sinks/flume-ng-morphline-solr-sink/src/test/resources/test-morphlines/solrCellDocumentTypes.conf
@@ -0,0 +1,260 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#  http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# Application configuration file in HOCON format (Human-Optimized Config Object Notation).
+# HOCON syntax is defined at http://github.com/typesafehub/config/blob/master/HOCON.md
+# and also used by Akka (http://www.akka.io) and Play (http://www.playframework.org/).
+# For more examples see http://doc.akka.io/docs/akka/2.1.2/general/configuration.html
+
+# morphline.conf example file
+# this is a comment
+// this is yet another comment
+
+morphlines : [
+  {
+    id : morphline1
+    importCommands : ["org.kitesdk.**", "org.apache.solr.**"]
+
+    commands : [
+      { separateAttachments {} }
+
+      # java command that doesn't do anything except for test compilation
+      {
+        java {
+          imports : "import java.util.*;"
+          code: """
+            List tags = record.get("javaWithImports");
+            return child.process(record);
+                """
+        }
+      }
+
+      # java command that doesn't do anything except for test compilation
+      {
+        java {
+          code: """
+            List tags = record.get("javaWithoutImports");
+            return child.process(record);
+                """
+        }
+      }
+
+      {
+        # used for auto-detection if MIME type isn't explicitly supplied
+        detectMimeType {
+          includeDefaultMimeTypes : true
+          mimeTypesFiles : [target/test-classes/custom-mimetypes.xml]
+        }
+      }
+
+      {
+        tryRules {
+          throwExceptionIfAllRulesFailed : true
+          rules : [
+            # next top-level rule:
+            {
+              commands : [
+                { logDebug { format : "hello unpack" } }
+                { unpack {} }
+                { generateUUID {} }
+                { callParentPipe {} }
+              ]
+            }
+
+            {
+              commands : [
+                { logDebug { format : "hello decompress" } }
+                { decompress {} }
+                { callParentPipe {} }
+              ]
+            }
+
+            {
+              commands : [
+                {
+                  readAvroContainer {
+                    supportedMimeTypes : [avro/binary]
+                    # readerSchemaString : "<json can go here>" # optional, avro json schema blurb for getSchema()
+                    # readerSchemaFile : /path/to/syslog.avsc
+                  }
+                }
+
+                { extractAvroTree {} }
+
+                {
+                  setValues {
+                    id : "@{/id}"
+                    user_screen_name : "@{/user_screen_name}"
+                    text : "@{/text}"
+                  }
+                }
+
+                {
+                  sanitizeUnknownSolrFields {
+                    solrLocator : ${SOLR_LOCATOR}
+                  }
+                }
+              ]
+            }
+
+            {
+              commands : [
+                {
+                  readJsonTestTweets {
+                    supportedMimeTypes : ["mytwittertest/json+delimited+length"]
+                  }
+                }
+
+                {
+                  sanitizeUnknownSolrFields {
+                    solrLocator : ${SOLR_LOCATOR}
+                  }
+                }
+              ]
+            }
+
+            # next top-level rule:
+            {
+              commands : [
+                { logDebug { format : "hello solrcell" } }
+                {
+                  # wrap SolrCell around an HTML Tika parser
+                  solrCell {
+                    solrLocator : ${SOLR_LOCATOR}
+                    # captureAttr : true # default is false
+                    capture : [
+
+                      # twitter feed schema
+                      user_friends_count
+                      user_location
+                      user_description
+                      user_statuses_count
+                      user_followers_count
+                      user_name
+                      user_screen_name
+                      created_at
+                      text
+                      retweet_count
+                      retweeted
+                      in_reply_to_user_id
+                      source
+                      in_reply_to_status_id
+                      media_url_https
+                      expanded_url
+
+                      # file metadata
+                      file_download_url
+                      file_upload_url
+                      file_scheme
+                      file_host
+                      file_port
+                      file_path
+                      file_name
+                      file_length
+                      file_last_modified
+                      file_owner
+                      file_group
+                      file_permissions_user
+                      file_permissions_group
+                      file_permissions_other
+                      file_permissions_stickybit
+                    ]
+
+                    fmap : { content : text, content-type : content_type } # rename "content" field to "text" fields
+                    dateFormats : [ "yyyy-MM-dd'T'HH:mm:ss", "yyyy-MM-dd"] # various java.text.SimpleDateFormat
+                    # xpath : "/xhtml:html/xhtml:body/xhtml:div/descendant:node()"
+                    uprefix : "ignored_"
+                    lowernames : true
+                    # solrContentHandlerFactory : org.apache.solr.tika.TrimSolrContentHandlerFactory
+
+                    # Tika parsers to be registered. If multiple parsers support the same MIME type,
+                    # the parser is chosen that is closest to the bottom in this list:
+                    parsers : [
+                      { parser : org.apache.tika.parser.asm.ClassParser }
+                      # { parser : org.gagravarr.tika.OggParser, additionalSupportedMimeTypes : [audio/ogg] }
+                      { parser : org.gagravarr.tika.FlacParser }
+                      { parser : org.apache.tika.parser.audio.AudioParser }
+                      { parser : org.apache.tika.parser.audio.MidiParser }
+                      { parser : org.apache.tika.parser.crypto.Pkcs7Parser }
+                      { parser : org.apache.tika.parser.dwg.DWGParser }
+                      { parser : org.apache.tika.parser.epub.EpubParser }
+                      { parser : org.apache.tika.parser.executable.ExecutableParser }
+                      { parser : org.apache.tika.parser.feed.FeedParser }
+                      { parser : org.apache.tika.parser.font.AdobeFontMetricParser }
+                      { parser : org.apache.tika.parser.font.TrueTypeParser }
+                      { parser : org.apache.tika.parser.xml.XMLParser }
+                      { parser : org.apache.tika.parser.html.HtmlParser }
+                      { parser : org.apache.tika.parser.image.ImageParser }
+                      { parser : org.apache.tika.parser.image.PSDParser }
+                      { parser : org.apache.tika.parser.image.TiffParser }
+                      { parser : org.apache.tika.parser.iptc.IptcAnpaParser }
+                      { parser : org.apache.tika.parser.iwork.IWorkPackageParser }
+                      { parser : org.apache.tika.parser.jpeg.JpegParser }
+                      { parser : org.apache.tika.parser.mail.RFC822Parser }
+                      { parser : org.apache.tika.parser.mbox.MboxParser, additionalSupportedMimeTypes : [message/x-emlx] }
+                      { parser : org.apache.tika.parser.microsoft.OfficeParser }
+                      { parser : org.apache.tika.parser.microsoft.TNEFParser }
+                      { parser : org.apache.tika.parser.microsoft.ooxml.OOXMLParser }
+                      { parser : org.apache.tika.parser.mp3.Mp3Parser }
+                      { parser : org.apache.tika.parser.mp4.MP4Parser }
+                      { parser : org.apache.tika.parser.hdf.HDFParser }
+                      { parser : org.apache.tika.parser.netcdf.NetCDFParser }
+                      { parser : org.apache.tika.parser.odf.OpenDocumentParser }
+                      { parser : org.apache.tika.parser.pdf.PDFParser }
+                      { parser : org.apache.tika.parser.pkg.CompressorParser }
+                      { parser : org.apache.tika.parser.pkg.PackageParser }
+                      { parser : org.apache.tika.parser.rtf.RTFParser }
+                      { parser : org.apache.tika.parser.txt.TXTParser }
+                      { parser : org.apache.tika.parser.video.FLVParser }
+                      { parser : org.apache.tika.parser.xml.DcXMLParser }
+                      { parser : org.apache.tika.parser.xml.FictionBookParser }
+                      { parser : org.apache.tika.parser.chm.ChmParser }
+                    ]
+                  }
+                }
+
+                { generateUUID { field : ignored_base_id } }
+
+                {
+                  generateSolrSequenceKey {
+                    baseIdField: ignored_base_id
+                    solrLocator : ${SOLR_LOCATOR}
+                  }
+                }
+
+              ]
+            }
+          ]
+        }
+      }
+
+      {
+        loadSolr {
+          solrLocator : ${SOLR_LOCATOR}
+        }
+      }
+
+      {
+        logDebug {
+          format : "My output record: {}"
+          args : ["@{}"]
+        }
+      }
+
+    ]
+  }
+]
diff --git a/code/flume-ng-sinks/pom.xml b/code/flume-ng-sinks/pom.xml
new file mode 100644
index 0000000..2b7bec5
--- /dev/null
+++ b/code/flume-ng-sinks/pom.xml
@@ -0,0 +1,98 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to You under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+     http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+
+  <modelVersion>4.0.0</modelVersion>
+
+  <parent>
+    <artifactId>flume-parent</artifactId>
+    <groupId>org.apache.flume</groupId>
+    <version>1.7.0</version>
+  </parent>
+
+  <groupId>org.apache.flume</groupId>
+  <artifactId>flume-ng-sinks</artifactId>
+  <name>Flume NG Sinks</name>
+  <packaging>pom</packaging>
+
+  <build>
+    <plugins>
+      <plugin>
+        <groupId>org.apache.rat</groupId>
+        <artifactId>apache-rat-plugin</artifactId>
+      </plugin>
+    </plugins>
+  </build>
+
+  <modules>
+    <module>flume-hdfs-sink</module>
+    <module>flume-irc-sink</module>
+    <module>flume-ng-hbase-sink</module>
+    <module>flume-ng-elasticsearch-sink</module>
+    <module>flume-ng-morphline-solr-sink</module>
+    <module>flume-ng-kafka-sink</module>
+  </modules>
+
+  <profiles>
+
+    <profile>
+      <id>hadoop-1.0</id>
+      <activation>
+        <property>
+          <name>flume.hadoop.profile</name>
+          <value>1</value>
+        </property>
+      </activation>
+    </profile>
+
+    <profile>
+      <id>hadoop-2</id>
+      <activation>
+        <property>
+          <name>flume.hadoop.profile</name>
+          <value>2</value>
+        </property>
+      </activation>
+      <!-- add the flume-dataset-sink, which is only compatible with hadoop-2
+           -->
+      <modules>
+        <module>flume-dataset-sink</module>
+        <module>flume-hive-sink</module>
+      </modules>
+    </profile>
+
+    <profile>
+      <id>hbase-1</id>
+      <activation>
+        <property>
+          <name>!flume.hadoop.profile</name>
+        </property>
+      </activation>
+      <!-- add the flume-dataset-sink, which is only compatible with hadoop-2
+           -->
+      <modules>
+        <module>flume-dataset-sink</module>
+        <module>flume-hive-sink</module>
+      </modules>
+    </profile>
+
+
+  </profiles>
+
+</project>
diff --git a/image/TALKDATA_share.png b/image/TALKDATA_share.png
new file mode 100644
index 0000000..5b34978
Binary files /dev/null and b/image/TALKDATA_share.png differ
diff --git a/image/qqqun.jpg b/image/qqqun.jpg
new file mode 100644
index 0000000..49bb85c
Binary files /dev/null and b/image/qqqun.jpg differ
diff --git "a/news-bigdataproject/10\343\200\201flume-hbase-kfk\350\201\224\350\260\203.md" "b/news-bigdataproject/10\343\200\201flume-hbase-kfk\350\201\224\350\260\203.md"
new file mode 100644
index 0000000..794ac64
--- /dev/null
+++ "b/news-bigdataproject/10\343\200\201flume-hbase-kfk\350\201\224\350\260\203.md"
@@ -0,0 +1,139 @@
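+## Chapter 10: End-to-End Integration Test of Flume + HBase + Kafka
+date: 2019-1-20 20:30:01
+
+
+### Overview of the end-to-end test
+This chapter tests everything designed in the previous chapters. The core flow: Flume collects and aggregates the logs, then sends them to Kafka for consumption and to HBase for storage.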
+![](http://ww1.sinaimg.cn/large/005BOtkIly1fzd3g6rboxj30go0gp43u.jpg)
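+### Simple preprocessing of the raw log data
+1. Download the Sogou Labs data
+http://www.sogou.com/labs/resource/q.php
+2. Format
+Each record has the form: access time\tuser ID\t[query]\trank of the URL in the returned results\tsequence number of the user's click\tURL the user clicked
+The user ID is assigned automatically from the browser cookie when the user visits the search engine, so different queries entered in the same browser session share one user ID.
+3. Preprocess the log
+1) Replace the tabs in the file with commas
+cat weblog.log|tr "\t" "," > weblog2.log
+2) Replace the spaces in the file with commas
+cat weblog2.log|tr " " "," > weblog3.log
+Result: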
+![](http://ww1.sinaimg.cn/large/005BOtkIly1fzd3eylp5zj30l008fab1.jpg)
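+### Writing the log generation simulator
+1. Implementation
+The program reads the raw log one line at a time and keeps appending each line to another file (weblog-flume.log), so that file behaves like a server log that keeps growing. After writing the program, package the project as weblogs.jar and upload it to the /opt/jars directory (create it beforehand) on the bigdata-pro02.kfk.com and bigdata-pro03.kfk.com nodes.
+Project source: https://github.com/changeforeda/Big-Data-Project/tree/master/code/DataProducer
+2. Write the shell script that runs the log simulator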
+```
+1)
+On the bigdata-pro02.kfk.com node, create the weblog-shell.sh script in /opt/datas.
+vi weblog-shell.sh
+#!/bin/bash
+echo "start log......"
+# The first argument is the raw log file; the second is the output file for the generated log
+java -jar /opt/jars/weblogs.jar /opt/datas/weblog.log /opt/datas/weblog-flume.log
+
+Make weblog-shell.sh executable
+chmod 777 weblog-shell.sh
+2)
+Copy the /opt/datas/ directory from bigdata-pro02.kfk.com to the bigdata-pro03.kfk.com node
+scp -r /opt/datas/ bigdata-pro03.kfk.com:/opt/
+```
+3. Run the test
+/opt/datas/weblog-shell.sh
+![](http://ww1.sinaimg.cn/large/005BOtkIly1fzdb284hefj30he0chn95.jpg)
+### Helper shell scripts
+1. Write the shell scripts that start the Flume agents
+```
+1. On bigdata-pro02.kfk.com, write the Flume start script in the Flume installation directory.
+vi flume-kfk-start.sh
+#!/bin/bash
+echo "flume-2 start ......"
+bin/flume-ng agent --conf conf -f conf/flume-conf.properties -n agent2 -Dflume.root.logger=INFO,console
+2. On bigdata-pro03.kfk.com, write the Flume start script in the Flume installation directory.
+vi flume-kfk-start.sh
+#!/bin/bash
+echo "flume-3 start ......"
+bin/flume-ng agent --conf conf -f conf/flume-conf.properties -n agent3 -Dflume.root.logger=INFO,console
+3. On bigdata-pro01.kfk.com, write the Flume start script in the Flume installation directory.
+vi flume-kfk-start.sh
+#!/bin/bash
+echo "flume-1 start ......"
+bin/flume-ng agent --conf conf -f conf/flume-conf.properties -n agent1 -Dflume.root.logger=INFO,console
+
+```
+2. Write the Kafka consumer script
+```
+1. On bigdata-pro01.kfk.com, write the Kafka consumer script in the Kafka installation directory
+vi kfk-test-consumer.sh
+#!/bin/bash
+echo "kfk-kafka-consumer.sh start ......"
+bin/kafka-console-consumer.sh --zookeeper bigdata-pro01.kfk.com:2181,bigdata-pro02.kfk.com:2181,bigdata-pro03.kfk.com:2181 --from-beginning --topic weblogs
+2. Distribute kfk-test-consumer.sh to the other two nodes
+scp kfk-test-consumer.sh bigdata-pro02.kfk.com:/opt/modules/kafka_2.11-0.8.2.1/
+scp kfk-test-consumer.sh bigdata-pro03.kfk.com:/opt/modules/kafka_2.11-0.8.2.1/
+
+```
+### Integration test: data collection and distribution
+```
+1. Start ZooKeeper on every node
+/opt/modules/zookeeper-3.4.5-cdh5.10.0/bin/zkServer.sh start
+/opt/modules/zookeeper-3.4.5-cdh5.10.0/bin/zkCli.sh  # log in with the client to check that ZooKeeper started
+
+2. Start HDFS  --- http://bigdata-pro01.kfk.com:50070/
+On node 1: /opt/modules/hadoop-2.6.0/sbin/start-dfs.sh
+# On node 1 and node 2, start zkfc for NameNode high availability
+/opt/modules/hadoop-2.6.0/sbin/hadoop-daemon.sh start zkfc
+
+3. Start HBase  ---- http://bigdata-pro01.kfk.com:60010/
+# On node 1, start HBase
+/opt/modules/hbase-1.0.0-cdh5.4.0/bin/start-hbase.sh
+# On node 2, start the backup master
+/opt/modules/hbase-1.0.0-cdh5.4.0/bin/hbase-daemon.sh start  master
+# Start the HBase shell to run commands
+/opt/modules/hbase-1.0.0-cdh5.4.0/bin/hbase shell
+# Create the HBase table for the project
+bin/hbase shell
+create 'weblogs','info'
+
+4. Start Kafka
+# Start Kafka on every node
+cd /opt/modules/kafka_2.10-0.9.0.0
+bin/kafka-server-start.sh config/server.properties &
+# Create the project topic
+bin/kafka-topics.sh --zookeeper bigdata-pro01.kfk.com:2181,bigdata-pro02.kfk.com:2181,bigdata-pro03.kfk.com:2181 --create --topic weblogs --replication-factor 2 --partitions 1
+# Consume (the script written earlier also works)
+bin/kafka-console-consumer.sh --zookeeper bigdata-pro01.kfk.com:2181,bigdata-pro02.kfk.com:2181,bigdata-pro03.kfk.com:2181 --from-beginning --topic weblogs
+```
+Make sure all of the above started successfully; use jps to check the processes on each node.
+![](http://ww1.sinaimg.cn/large/005BOtkIly1fzdbmh1n31j309v042glj.jpg)![](http://ww1.sinaimg.cn/large/005BOtkIly1fzdbmovok3j309n03sa9y.jpg)![](http://ww1.sinaimg.cn/large/005BOtkIly1fzdbmw14tjj309o02cweb.jpg)
+```
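+5. Start Flume on every node
+# Start Flume on all three nodes (the script uses relative paths, so run it from the Flume directory)
+cd /opt/modules/flume-1.7.0-bin && ./flume-kfk-start.sh
+
+6. Start the log simulator on nodes 2 and 3
+/opt/datas/weblog-shell.sh
+
+7. Start the Kafka consumer
+# Consume (or use the kfk-test-consumer.sh script written earlier)
+bin/kafka-console-consumer.sh --zookeeper bigdata-pro01.kfk.com:2181,bigdata-pro02.kfk.com:2181,bigdata-pro03.kfk.com:2181 --from-beginning --topic weblogs
+
+8. Check the data being written to HBase
+bin/hbase shell
+count 'weblogs'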
+```
+Results:
+Kafka keeps consuming messages
+![](http://ww1.sinaimg.cn/large/005BOtkIly1fzdbszmkybj30rh0940ue.jpg)
+The amount of data in HBase keeps growing
+![](http://ww1.sinaimg.cn/large/005BOtkIly1fzdbtek6eqj30rv0ar0ud.jpg)
+
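+### Problems encountered
+1. A component fails to start
+A likely cause is a missing environment variable. For example, Flume invokes Java at startup, so the Java environment variable has to be set in Flume's configuration (for example conf/flume-env.sh).
+2. Every component is running, but no data arrives
+In my case the Flume sink was misconfigured, so no data could be output at all. I found this by first switching the sink to console output and seeing that no data appeared, then going back to check what was wrong with the sink configuration.
+**3. Fixing miscellaneous small problems**
+1) Read the log of the failing component, or paste the error from the log into a search engine (Baidu); this resolves roughly 60% of cases.
+2) If the whole pipeline does not work, start at the data source and rule out causes step by step downstream. For example, if no data arrives, check whether the source is actually emitting any.
+3) If the problem persists, consider whether there is something you have misunderstood or misconfigured. Or you can talk with me.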
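+
+For problem 2 above, one way to check whether events reach the sink at all is to swap the real sink for Flume's built-in `logger` sink, which writes each event to the agent's log. The fragment below is a minimal sketch in the style of flume-conf.properties. It assumes the agent is named agent2 (as in the start script) and uses the placeholder names k1 for the sink and c1 for the channel; the actual names in this project's flume-conf.properties are not shown in this chapter.
+```
+# Debugging sketch: temporarily replace the real sink with the logger sink.
+# k1 and c1 are hypothetical names; use the ones from your own flume-conf.properties.
+agent2.sinks = k1
+agent2.sinks.k1.type = logger
+agent2.sinks.k1.channel = c1
+```
+With -Dflume.root.logger=INFO,console, as in the start scripts above, events should then show up on the console when the source and channel are working. If nothing appears, the problem is probably upstream of the sink.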
diff --git "a/news-bigdataproject/11\343\200\201mysql-hive.md" "b/news-bigdataproject/11\343\200\201mysql-hive.md"
new file mode 100644
index 0000000..3a480f3
--- /dev/null
+++ "b/news-bigdataproject/11\343\200\201mysql-hive.md"
@@ -0,0 +1,146 @@
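+## Chapter 11: Installing and Integrating MySQL and Hive
+date: 2019-1-22 22:30:01
+
+
+### Why MySQL?
+In this project MySQL stores the Hive metadata, and it can also hold the results of offline analysis.
+
+### Installing MySQL
+Install MySQL online with yum using the commands below (the yum repository can be switched to the Aliyun mirror, which is faster and more stable).
+```
+1. Install MySQL online
+Run the following yum commands.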
+yum clean all
+yum install mysql-server
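+2. Start the MySQL service and test it
+sudo chown -R kfk:kfk /usr/bin/mysql    # give ownership to the kfk user
+1) Check the MySQL service status
+sudo service mysqld status
+2) Start the MySQL service
+sudo service mysqld start
+3) Set the MySQL root password
+/usr/bin/mysqladmin -u root password '123456'
+4) Connect to MySQL
+mysql -uroot -p123456
+a) List the databases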
+show databases;
+mysql
+test
+b) Switch to the test database
+use test;
+c) List the tables
+show tables;
+```
+Most problems here are permission issues; run the command with sudo or restart MySQL.
+
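+### Installing Hive
+In this project Hive runs offline analysis on the data in HBase and writes the results to MySQL or HBase, where they can then be visualized.
+![](http://ww1.sinaimg.cn/large/005BOtkIly1fzfpw9k0v7j30kv09rtfp.jpg)
+The version used here is apache-hive-2.1.0-bin.tar.gz
+(apache-hive-0.13.1-bin.tar.gz was tried first but failed to integrate with HBase; the cause is odd and is covered in detail in the next chapter).
+1. Extract
+```
+The usual steps:
+tar -zxf apache-hive-2.1.0-bin.tar.gz -C /opt/modules/
+mv /opt/modules/apache-hive-2.1.0-bin /opt/modules/hive-2.1.0     # rename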
+```
+2. Edit the configuration files
+```
+1) hive-log4j.properties
+# The log directory must be created beforehand
+hive.log.dir=/opt/modules/hive-2.1.0/logs
+2) Edit hive-env.sh
+HADOOP_HOME=/opt/modules/hadoop-2.6.0
+HBASE_HOME=/opt/modules/hbase-1.0.0-cdh5.4.0
+# Hive Configuration Directory can be controlled by:
+export HIVE_CONF_DIR=/opt/modules/hive-2.1.0/conf
+```
+3. Start up for testing
+Start HDFS first, then create the directories Hive uses (run from the Hadoop installation directory)
+bin/hdfs dfs -mkdir -p /tmp
+bin/hdfs dfs -chmod g+w /tmp
+bin/hdfs dfs -mkdir -p /user/hive/warehouse
+bin/hdfs dfs -chmod g+w /user/hive/warehouse
+4. Test
+```
+./hive
+# List the databases
+show databases;
+# Use the default database
+use default;
+# List the tables
+show tables;
+
+```
+### Integrating Hive with MySQL
+MySQL stores the Hive metadata.
+1. Create hive-site.xml under /opt/modules/hive-2.1.0/conf and configure the MySQL metastore database.
+```
+<?xml version="1.0"?>
+<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
+
+
+<configuration>
+  <property>
+    <name>javax.jdo.option.ConnectionURL</name>
+    <value>jdbc:mysql://bigdata-pro01.kfk.com/metastore?createDatabaseIfNotExist=true</value>
+  </property>
+  <property>
+    <name>javax.jdo.option.ConnectionDriverName</name>
+    <value>com.mysql.jdbc.Driver</value>
+  </property>
+  <property>
+    <name>javax.jdo.option.ConnectionUserName</name>
+    <value>root</value>
+  </property>
+  <property>
+    <name>javax.jdo.option.ConnectionPassword</name>
+    <value>123456</value>
+  </property>
+  <property>
+    <name>hbase.zookeeper.quorum</name>
+    <value>bigdata-pro01.kfk.com,bigdata-pro02.kfk.com,bigdata-pro03.kfk.com</value>
+  </property>
+
+
+</configuration>
+```
+2. Configure the user connection settings
+
+1) View the user records
+mysql -uroot -p123456
+show databases;
+use mysql;
+show tables;
+select User,Host,Password from user;
+2) Update the user record
+update user set Host='%' where User = 'root' and Host='localhost';
+3) Delete the other user records
+delete from user where user='root' and host='127.0.0.1';
+select User,Host,Password from user;
+delete from user where host='localhost';
+Keep deleting until only the row shown in the figure remains
+![](http://ww1.sinaimg.cn/large/005BOtkIly1fzfqckmjxej30ej031q2s.jpg)
+4) Flush the privileges
+flush privileges;
+3. Copy the MySQL driver jar into Hive's lib directory
+cp  mysql-connector-java-5.1.35.jar /opt/modules/hive-2.1.0/lib/
+4. Make sure the third node can log in to the other nodes over SSH without a password
+
+### Testing Hive with MySQL
+1. Start the HDFS and YARN services
+2. Start Hive
+./hive
+3. Create a table through Hive
+CREATE TABLE stu(id INT,name STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' ;
+4. Create a data file
+vi /opt/datas/stu.txt
+00001	zhangsan
+00002	lisi
+00003	wangwu
+00004	zhaoliu
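+5. Load the data into the Hive table
+load data local inpath '/opt/datas/stu.txt' into table stu;
+Querying the table in Hive should now show its contents.
+The Hive metastore metadata is now visible in the MySQL database. (For what the metadata is, see an introduction to Hive.)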
+![](http://ww1.sinaimg.cn/large/005BOtkIly1fzfqeibkrtj306103ta9v.jpg)
\ No newline at end of file
diff --git "a/news-bigdataproject/12\343\200\201hive-hbase.md" "b/news-bigdataproject/12\343\200\201hive-hbase.md"
new file mode 100644
index 0000000..41b7a73
--- /dev/null
+++ "b/news-bigdataproject/12\343\200\201hive-hbase.md"
@@ -0,0 +1,79 @@
+## Chapter 12: Integrating Hive with HBase
+date: 2019-1-23 21:30:01
+
+
+### Configuring the Hive and HBase integration
+1. Configure ZooKeeper in hive-site.xml; Hive uses this parameter to connect to the HBase cluster.
+```
+<property>
+    <name>hbase.zookeeper.quorum</name>   <value>bigdata-pro01.kfk.com,bigdata-pro02.kfk.com,bigdata-pro03.kfk.com</value>
+</property>
+```
+2. Some of the HBase jars must be made available to Hive
+Symbolic links are used here.
+Run the following commands:
+```
+export HBASE_HOME=/opt/modules/hbase-1.0.0-cdh5.4.0
+export HIVE_HOME=/opt/modules/hive-2.1.0
+ln -s $HBASE_HOME/lib/hbase-server-1.0.0-cdh5.4.0.jar $HIVE_HOME/lib/hbase-server-1.0.0-cdh5.4.0.jar
+
+ln -s $HBASE_HOME/lib/hbase-client-1.0.0-cdh5.4.0.jar $HIVE_HOME/lib/hbase-client-1.0.0-cdh5.4.0.jar
+
+ln -s $HBASE_HOME/lib/hbase-protocol-1.0.0-cdh5.4.0.jar $HIVE_HOME/lib/hbase-protocol-1.0.0-cdh5.4.0.jar 
+
+ln -s $HBASE_HOME/lib/hbase-it-1.0.0-cdh5.4.0.jar $HIVE_HOME/lib/hbase-it-1.0.0-cdh5.4.0.jar 
+
+ln -s $HBASE_HOME/lib/htrace-core-3.0.4.jar $HIVE_HOME/lib/htrace-core-3.0.4.jar
+
+ln -s $HBASE_HOME/lib/hbase-hadoop2-compat-1.0.0-cdh5.4.0.jar $HIVE_HOME/lib/hbase-hadoop2-compat-1.0.0-cdh5.4.0.jar 
+
+ln -s $HBASE_HOME/lib/hbase-hadoop-compat-1.0.0-cdh5.4.0.jar $HIVE_HOME/lib/hbase-hadoop-compat-1.0.0-cdh5.4.0.jar
+
+ln -s $HBASE_HOME/lib/high-scale-lib-1.1.1.jar $HIVE_HOME/lib/high-scale-lib-1.1.1.jar 
+
+ln -s $HBASE_HOME/lib/hbase-common-1.0.0-cdh5.4.0.jar $HIVE_HOME/lib/hbase-common-1.0.0-cdh5.4.0.jar 
+```
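+3. Test
+Create a table with data in HBase (physically stored on HDFS), then create a Hive table that maps to the HBase table.
+1) First create a table in HBase
+(If you are unfamiliar with the commands, see https://www.cnblogs.com/cxzdy/p/5583239.html)
+![](http://ww1.sinaimg.cn/large/005BOtkIly1fzgupdmei1j30h5037mx4.jpg)
+2) Start Hive and create the mapping (MySQL must be started first, because the metadata lives there)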
+```
+create external table t1(
+key int,
+name string,
+age string
+)  
+STORED BY  'org.apache.hadoop.hive.hbase.HBaseStorageHandler' 
+WITH SERDEPROPERTIES("hbase.columns.mapping" = ":key,info:name,info:age") 
+TBLPROPERTIES("hbase.table.name" = "t1");
+```
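+3) Result in Hive
+Run select * from t1;
+![](http://ww1.sinaimg.cn/large/005BOtkIly1fzgutrr5x7j30b0035glg.jpg)
+4. Map the project's weblogs table
+The data was loaded into HBase through Flume earlier, so we create the same kind of mapping in Hive. Hive can then run simple SQL for offline analysis on the data in HBase.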
+```
+create external table weblogs(
+id string,
+datatime string,
+userid string,
+searchname string,
+retorder string,
+cliorder string,
+cliurl string
+)  
+STORED BY  'org.apache.hadoop.hive.hbase.HBaseStorageHandler' 
+WITH SERDEPROPERTIES("hbase.columns.mapping" = ":key,info:datatime,info:userid,info:searchname,info:retorder,info:cliorder,info:cliurl") 
+TBLPROPERTIES("hbase.table.name" = "weblogs");
+```
+
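+### A fatal bug in the Hive and HBase integration
+The problem is shown below:
+![](http://ww1.sinaimg.cn/large/005BOtkIly1fzguxe0p4ej30nu0hl0ua.jpg)
+Reference: https://www.cnblogs.com/zlslch/p/8228781.html
+Following that reference still did not solve the problem.
+At first I suspected that the HBase jars had not been linked into Hive, or that the wrong ones had been linked, but that was not the cause. Someone online hit the same problem and wrote a post about it, ending with no idea how to solve it.
+*********************************************
+Finally I checked the official site, which says that HBase 1.x and later need a newer Hive, preferably Hive 2.x. The error above came from using hive-0.13.1-bin with hbase-1.0.0-cdh5.4.0, which are presumably incompatible. So I switched to hive-2.1.0; I checked and this version is also compatible with the other Hadoop components. The configuration follows the same steps as before (the previous chapter and this one), mainly the MySQL metastore setup (don't forget the driver jar) and the XML configs, followed by a test. Finally, before restarting Hive, **restart HBase first**; this is important. After that it worked.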
diff --git "a/news-bigdataproject/13\343\200\201hue.md" "b/news-bigdataproject/13\343\200\201hue.md"
new file mode 100644
index 0000000..f22b127
--- /dev/null
+++ "b/news-bigdataproject/13\343\200\201hue.md"
@@ -0,0 +1,133 @@
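+## Chapter 13: Big Data Visual Analysis with Cloudera HUE
+date: 2019-1-26 19:30:01
+
+
+### Downloading and installing Hue
+Version: hue-3.9.0-cdh5.15.0
+1. First install the dependencies with yum. The VM needs network access; Hue is installed on node 3 here.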
+```
+yum -y install ant asciidoc cyrus-sasl-devel cyrus-sasl-gssapi gcc gcc-c++ krb5-devel libtidy libxml2-devel libxslt-devel openldap-devel python-devel sqlite-devel openssl-devel mysql-devel gmp-devel  
+```
+2. Extract
+tar -zxf hue-3.9.0-cdh5.15.0.tar.gz -C /opt/modules/
+3. Build
+cd /opt/modules/hue-3.9.0-cdh5.15.0
+make apps
+4. Basic configuration and test
+```
+1) Edit the configuration file
+cd desktop
+cd conf
+vi hue.ini
+# Secret key
+secret_key=jFE93j;2[290-eiw.KEiwN2s3['d;/.q[eIW^y#e=+Iei*@Mn < qW5o
+# Host and port
+http_host=bigdata-pro03.kfk.com
+http_port=8888
+# Time zone
+time_zone=Asia/Shanghai
+2) Change the permissions of the desktop.db file
+chmod o+w /opt/modules/hue-3.9.0-cdh5.15.0/desktop/desktop.db
+3) Start the Hue service
+/opt/modules/hue-3.9.0-cdh5.15.0/build/env/bin/supervisor
+4) Open the Hue web UI
+bigdata-pro03.kfk.com:8888
+```
+### Integrating Hue with HDFS
+```
+1) Edit Hadoop's core-site.xml and add the following
+<property>
+    <name>hadoop.proxyuser.hue.hosts</name>
+    <value>*</value>
+</property>
+<property>
+    <name>hadoop.proxyuser.hue.groups</name>
+    <value>*</value>
+</property>
+
+2) Edit hue.ini
+fs_defaultfs=hdfs://ns
+webhdfs_url=http://bigdata-pro01.kfk.com:50070/webhdfs/v1
+hadoop_hdfs_home=/opt/modules/hadoop-2.6.0
+hadoop_bin=/opt/modules/hadoop-2.6.0/bin
+hadoop_conf_dir=/opt/modules/hadoop-2.6.0/etc/hadoop
+3) Distribute core-site.xml to the other nodes
+scp core-site.xml bigdata-pro02.kfk.com:/opt/modules/hadoop-2.6.0/etc/hadoop
+scp core-site.xml bigdata-pro01.kfk.com:/opt/modules/hadoop-2.6.0/etc/hadoop
+4) Restart Hue
+Start ZooKeeper and HDFS first, then start Hue
+/opt/modules/hue-3.9.0-cdh5.15.0/build/env/bin/supervisor
+```
+![](http://ww1.sinaimg.cn/large/005BOtkIly1fzk8cc0l9rj30ev0ebmx9.jpg)
+
+### Integrating Hue with YARN
+1. Edit hue.ini; see https://www.cnblogs.com/zlslch/p/6817226.html
+The settings differ depending on whether YARN runs in HA mode
+```
+ [[yarn_clusters]]
+
+    [[[default]]]
+      resourcemanager_host=rs
+      resourcemanager_port=8032
+      submit_to=True
+      logical_name=rm1
+      resourcemanager_api_url=http://bigdata-pro01.kfk.com:8088
+      proxy_api_url=http://bigdata-pro01.kfk.com:8088
+      history_server_api_url=http://bigdata-pro01.kfk.com:19888
+
+    [[[ha]]]
+      logical_name=rm2
+      submit_to=True
+      resourcemanager_api_url=http://bigdata-pro02.kfk.com:8088
+      history_server_api_url=http://bigdata-pro01.kfk.com:19888
+```
+2. Test
+Start YARN, then restart Hue.
+The jobs shown in the figure are ones I ran earlier
+![](http://ww1.sinaimg.cn/large/005BOtkIly1fzk8i4ao6kj314r07j74m.jpg)
+
+### Integrating Hue with MySQL and Hive
+1. Edit hue.ini
+```
+    [beeswax]
+
+
+      hive_server_host=bigdata-pro03.kfk.com
+      hive_server_port=10000
+      hive_conf_dir=/opt/modules/hive-2.1.0/conf
+  
+  
+  ......... (other settings in between) ......
+     [[[mysql]]]
+      nice_name="My SQL DB"
+
+      name=metastore
+      engine=mysql
+
+      host=bigdata-pro01.kfk.com
+      port=3306
+      user=root
+      password=123456
+```
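+2. Test
+Start MySQL on node 1 (it holds the metadata), then start the HiveServer2 service on node 3
+/opt/modules/hive-2.1.0/bin/hive --service hiveserver2 &    ## needed by Hue
+Then restart Hue.
+The figure shows a SQL query run through Hive against a Hive table. One problem remains: querying an HBase-backed table through Hive from Hue times out, and after two days I still have not solved it. (I had wanted to use Hive from within Hue to run offline computations on the HBase tables, but only Hive's own tables can be queried. Hive's beeline mode cannot query the HBase tables either, although the Hive CLI can.)
+Error log: java.io.IOException: org.apache.hadoop.hbase.client.RetriesExhaustedException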
+![](http://ww1.sinaimg.cn/large/005BOtkIly1fzk8rvrxu6j30n20e3q35.jpg)
+
+### Integrating Hue with HBase
+1. Edit hue.ini
+```
+[hbase]
+  hbase_clusters=(Cluster|bigdata-pro01.kfk.com:9090)
+  hbase_conf_dir=/opt/modules/hbase-1.0.0-cdh5.4.0/conf
+  thrift_transport=buffered
+```
+2. Start and test
+Start HBase first, then start the HBase Thrift service
+/opt/modules/hbase-1.0.0-cdh5.4.0/bin/hbase-daemon.sh start thrift
+Then restart Hue
+![](http://ww1.sinaimg.cn/large/005BOtkIly1fzk9mr2i83j30kp0ddaa9.jpg)
+
diff --git "a/news-bigdataproject/14\343\200\201spark on yarn.md" "b/news-bigdataproject/14\343\200\201spark on yarn.md"
new file mode 100644
index 0000000..06c0931
--- /dev/null
+++ "b/news-bigdataproject/14\343\200\201spark on yarn.md"	
@@ -0,0 +1,97 @@
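+## Chapter 14: Installing a Spark 2.x Cluster and Deploying Spark on YARN
+date: 2019-1-30 11:30:01
+
+
+### Getting ready to learn Spark
+1. Introduction
+Spark is a platform for fast, general-purpose cluster computing.
+
+A key feature of Spark is that it can compute in memory, which makes it faster. Even for complex computations that must run on disk, Spark is still more efficient than MapReduce.
+
+2. Learning resources
+1) The Databricks website
+**2) The official Spark website**  
+3) GitHub, which hosts many Spark examples
+
+### Installing the Spark cluster
+I use spark-2.2.0-bin-hadoop2.6.tgz because my cluster already runs hadoop2.6.0.
+Requirements: scala-2.11.12.tgz/java8/hadoop2.6.0.
+1. Download from the official site
+https://spark.apache.org/downloads.html
+2. Configure Spark
+Configure spark-env.sh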
+```
+export JAVA_HOME=/opt/modules/jdk1.8.0_191
+export SCALA_HOME=/opt/modules/scala-2.11.12
+
+export HADOOP_CONF_DIR=/opt/modules/hadoop-2.6.0/etc/hadoop
+export SPARK_CONF_DIR=/opt/modules/spark-2.2.0/conf
+export SPARK_MASTER_HOST=bigdata-pro02.kfk.com
+export SPARK_MASTER_PORT=7077
+export SPARK_MASTER_WEBUI_PORT=8080
+export SPARK_WORKER_CORES=1
+export SPARK_WORKER_MEMORY=1g
+export SPARK_WORKER_PORT=7078
+export SPARK_WORKER_WEBUI_PORT=8081
+```
+Configure slaves
+```
+bigdata-pro01.kfk.com
+bigdata-pro02.kfk.com
+bigdata-pro03.kfk.com
+```
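+If you integrate with Hive and Hive uses a MySQL database for its metastore, copy the MySQL JDBC driver mysql-connector-java-5.1.7-bin.jar into $SPARK_HOME/jars.
+3. Distribute the installation to every node
+4. Start and test on the designated master node (this is standalone mode)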
+![](http://ww1.sinaimg.cn/large/005BOtkIly1fzogese9lvj30on05o74m.jpg)
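+Open the Spark master web UI: http://bigdata-pro02.kfk.com:8080/
+It shows the status of every node.
+5. You can now run stop-all.sh (under Spark's sbin directory), because the standalone daemons are not needed in YARN mode.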
+
+### Spark on YARN
+Comparison of standalone mode and Spark on YARN: https://blog.csdn.net/lxhandlbb/article/details/70214003
+How Spark on YARN works: https://blog.csdn.net/liuwei0376/article/details/78637732
+1. Prerequisites
+Hadoop 2.6.0 is installed and running, because Spark on YARN depends on Hadoop.
+2. Start ZooKeeper, HDFS and YARN
+In an HA setup, ZooKeeper must be running as well.
+HDFS: http://bigdata-pro01.kfk.com:50070
+YARN: http://bigdata-pro01.kfk.com:8088
+3. Run Spark on the master node
+./spark-shell --master yarn --deploy-mode client
+![](http://ww1.sinaimg.cn/large/005BOtkIly1fzohi45ckpj30o70auq3g.jpg)
+The application also shows up in the YARN web UI.
+If the virtual machines have little memory, the following error can occur:
+```
+17/09/08 10:36:08 ERROR spark.SparkContext: Error initializing SparkContext.
+org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master.
+```
+Fix: stop YARN, edit yarn-site.xml, distribute it to every node, then restart YARN.
+Add the following properties:
+```
+    <property>
+        <name>yarn.nodemanager.vmem-check-enabled</name>
+        <value>false</value>
+    </property>
+    <property>
+        <name>yarn.nodemanager.vmem-pmem-ratio</name>
+        <value>4</value>
+    </property>
+```
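+
+To see what the ratio means in practice, here is a small arithmetic sketch. The 1024 MB container size is only an assumed example value; the ratio 4 comes from the setting above. Note that with yarn.nodemanager.vmem-check-enabled set to false the check is skipped altogether, so the ratio only matters if the check is switched back on.
+
+```shell
+# Virtual memory limit of a container = physical memory * vmem-pmem-ratio.
+# container_mb is a hypothetical example value, not taken from the cluster.
+container_mb=1024
+ratio=4
+vmem_limit_mb=$((container_mb * ratio))
+echo "vmem limit: ${vmem_limit_mb} MB"
+```
+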
+4. Run a test program
+```
+sc.parallelize(1 to 100,5).count
+```
+Check how the job ran:
+1) Open the YARN web UI
+![](http://ww1.sinaimg.cn/large/005BOtkIly1fzohlg58qyj31dz0b8tam.jpg)
+2) Click the ApplicationMaster link to open the job's page
+![](http://ww1.sinaimg.cn/large/005BOtkIly1fzoi538xtkj31ck0bwjrz.jpg)
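+If the page does not open:
+Pin the web UI address to a specific node. Node 1 is configured here, so the page is reachable while the ResourceManager is active on node 1. I had previously set the address to the cluster id rs, which did not work.
+Edit yarn-site.xml, distribute it to every node, then restart.
+```
+	<property>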
+		<name>yarn.resourcemanager.webapp.address</name>
+		<value>bigdata-pro01.kfk.com:8088</value>
+	</property>
+```
\ No newline at end of file
diff --git "a/news-bigdataproject/15\343\200\201spark-idea.md" "b/news-bigdataproject/15\343\200\201spark-idea.md"
new file mode 100644
index 0000000..d78377c
--- /dev/null
+++ "b/news-bigdataproject/15\343\200\201spark-idea.md"
@@ -0,0 +1,130 @@
+## Chapter 15: Developing Spark 2.x Programs in IDEA
+date: 2019-1-30 14:30:01
+
+
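+### Setting up the development environment
+1. Install IDEA
+2. Install Maven
+Download apache-maven-3.6.0 from the official site.
+3. Install Java 8 and configure the environment variables
+4. Install Scala directly through the IDEA plugin manager
+5. Install the Hadoop runtime for Windows and configure the environment variables
+(Software download link: https://github.com/changeforeda/Big-Data-Project/blob/master/README.md)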
+
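+### Developing in IDEA
+This guide is very thorough: https://blog.csdn.net/zkf541076398/article/details/79297820
+1. Create a new Maven project
+2. Configure Maven
+3. Select the Scala and Java versions
+4. Create a scala directory and mark it as a source root (see the screenshot)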
+![](http://ww1.sinaimg.cn/large/005BOtkIly1fzokar9bjxj30pj0dc13u.jpg)
+5. Write pom.xml
+Add only the dependencies you need; GitHub has examples:
+https://github.com/apache/spark/blob/master/examples/pom.xml
+My pom, which works for me:
+```
+<?xml version="1.0" encoding="UTF-8"?>
+
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
+
+  <modelVersion>4.0.0</modelVersion>
+  <packaging>war</packaging>
+
+  <name>TestSpark</name>
+  <groupId>com.kfk.spark</groupId>
+  <artifactId>TestSpark</artifactId>
+  <version>1.0-SNAPSHOT</version>
+
+
+  <properties>
+    <scala.version>2.11.12</scala.version>
+    <scala.binary.version>2.11</scala.binary.version>
+    <spark.version>2.2.0</spark.version>
+  </properties>
+
+  <dependencies>
+    <dependency>
+      <groupId>org.apache.spark</groupId>
+      <artifactId>spark-core_${scala.binary.version}</artifactId>
+      <version>${spark.version}</version>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.spark</groupId>
+      <artifactId>spark-streaming_${scala.binary.version}</artifactId>
+      <version>${spark.version}</version>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.spark</groupId>
+      <artifactId>spark-sql_${scala.binary.version}</artifactId>
+      <version>${spark.version}</version>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.spark</groupId>
+      <artifactId>spark-hive_${scala.binary.version}</artifactId>
+      <version>${spark.version}</version>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.spark</groupId>
+      <artifactId>spark-streaming-kafka-0-10_${scala.binary.version}</artifactId>
+      <version>${spark.version}</version>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hadoop</groupId>
+      <artifactId>hadoop-client</artifactId>
+      <version>2.6.0</version>
+    </dependency>
+  </dependencies>
+
+</project>
+
+```
+6. Write a test program
+```
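+import org.apache.spark.sql.SparkSession
+
+object test {
+  def main(args: Array[String]): Unit = {
+
+    val spark = SparkSession
+      .builder
+      // "yarn-cluster" is a deprecated alias in Spark 2.x; --master yarn --deploy-mode cluster is the preferred form
+      .master("yarn-cluster")
+      // .master("local[2]")  // use this line instead for local testing in the IDE
+      .appName("HdfsTest")
+      .getOrCreate()
+
+    val path = args(0) // input file
+    val out = args(1)  // output directory, must not exist yet
+
+    // word count: split each line on spaces, count every word, write the result
+    val rdd = spark.sparkContext.textFile(path)
+    rdd.flatMap(_.split(" ")).map(x => (x, 1)).reduceByKey(_ + _).saveAsTextFile(out)
+
+    spark.stop()
+  }
+
+}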
+```
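+7. Test locally
+Use master("local[2]") and pass Windows paths as arguments. If the program does not run, the development environment is almost certainly at fault; check first that the Hadoop environment variables are set.
+8. Build a jar
+See: https://jingyan.baidu.com/article/c275f6ba0bbb65e33d7567cb.html
+9. Upload the jar to the virtual machine and submit it to Spark on YARN
+The job still depends on HDFS underneath, so ZooKeeper, HDFS and YARN must be running first.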
+```
+ bin/spark-submit --class  test  --master yarn --deploy-mode cluster /opt/jars/TestSpark.jar  hdfs://ns/input/stu.txt  hdfs://ns/out
+```
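+
+For a small input it can help to cross-check the job's result locally. The sketch below reproduces the same split-and-count logic with standard shell tools. The sample text and the temporary file are made up for illustration, and unlike the Spark job it drops the empty tokens produced by repeated spaces.
+
+```shell
+# Hypothetical sample input; the real job reads hdfs://ns/input/stu.txt.
+sample=$(mktemp)
+printf 'a b a\nc b a\n' > "$sample"
+
+# Split on spaces, count each word, print in Spark's (word,count) tuple style.
+wordcount() {
+  tr ' ' '\n' < "$1" | grep -v '^$' | sort | uniq -c | awk '{print "(" $2 "," $1 ")"}'
+}
+
+result=$(wordcount "$sample")
+echo "$result"
+rm -f "$sample"
+```
+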
+When the job finishes, open the YARN web UI at http://bigdata-pro01.kfk.com:8088/cluster/
+and look for the success status of the application.
+![](http://ww1.sinaimg.cn/large/005BOtkIly1fzol6uy6oaj30of09hjsa.jpg)
+![](http://ww1.sinaimg.cn/large/005BOtkIly1fzom68klrwj315f0l23zq.jpg)
+
+10. What if the job fails? Read the logs.
+A convenient entry point is the logs link circled in the screenshot above.
+First configure yarn-site.xml:
+```
+	<property>
+         <name>yarn.log.server.url</name>
+         <value>http://bigdata-pro01.kfk.com:19888/jobhistory/logs</value>
+	</property>
+```
+Restart YARN,
+then start the history server on the node you configured: ./mr-jobhistory-daemon.sh start historyserver
+Open: http://bigdata-pro01.kfk.com:8088/cluster
+![](http://ww1.sinaimg.cn/large/005BOtkIly1fzom51hwo5j30pa0npaci.jpg)
\ No newline at end of file
diff --git "a/news-bigdataproject/16\343\200\201spark-streaming1.md" "b/news-bigdataproject/16\343\200\201spark-streaming1.md"
new file mode 100644
index 0000000..bce3a56
--- /dev/null
+++ "b/news-bigdataproject/16\343\200\201spark-streaming1.md"
@@ -0,0 +1,87 @@
+## Chapter 16: Real-Time Data Processing with Spark Streaming
+date: 2019-2-03 14:30:01
+
+
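+### Introduction to Spark Streaming
+In essence, Spark Streaming cuts the incoming stream into small batches at a fixed batch interval and processes each batch as a set of RDDs.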
+![](http://ww1.sinaimg.cn/large/005BOtkIly1fzosp71irxj30g108hq75.jpg)
+
+### Testing a socket source from IDEA
+1. Start nc on node 1
+nc -lk 9999
+Type a few words.
+2. Run the program in IDEA
+```
+import org.apache.spark.SparkConf
+import org.apache.spark.streaming.{Seconds, StreamingContext}
+object TestStreaming {
+
+  def main(args: Array[String]): Unit = {
+
+    val conf = new SparkConf().setMaster("local[2]").setAppName("NetworkWordCount")
+    val ssc = new StreamingContext(conf, Seconds(5))
+
+    val lines = ssc.socketTextStream("bigdata-pro01.kfk.com",9999)
+    val words = lines.flatMap(_.split(" "))
+    // map/reduce step: count each word
+    val wordCounts = words.map(x => (x, 1)).reduceByKey(_ + _)
+    wordCounts.print()
+    ssc.start()
+    ssc.awaitTermination()
+
+  }
+}
+```
+![](http://ww1.sinaimg.cn/large/005BOtkIly1fzot0y7ib8j30tl0a4mxy.jpg)
+
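+### Integrating Spark Streaming with Kafka
+Version issue:
+![](http://ww1.sinaimg.cn/large/005BOtkIly1fzr9u0l3szj31cl048js0.jpg)
+I ran into a version problem. My cluster runs Kafka 0.9, while IDEA-based development now usually targets Kafka 0.10. Fortunately the official documentation still covers an integration that works with Kafka 0.9 (the 0-8 integration, which supports brokers 0.8.2.1 or higher), so I followed it:
+http://spark.apache.org/docs/2.2.0/streaming-kafka-0-8-integration.html
+Example code: https://github.com/apache/spark/blob/v2.2.0/examples/src/main/scala/org/apache/spark/examples/streaming/DirectKafkaWordCount.scala
+Test program for Kafka 0.9: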
+```scala
+import kafka.serializer.StringDecoder
+import org.apache.spark.sql.SparkSession
+import org.apache.spark.streaming.kafka.KafkaUtils
+import org.apache.spark.streaming.{Seconds, StreamingContext}
+
+
+object KfkStreaming {
+   def main(args: Array[String]): Unit = {
+
+     val spark  = SparkSession.builder()
+       .master("local[2]")
+       .appName("kfkstreaming").getOrCreate()
+
+     val sc =spark.sparkContext
+     val ssc = new StreamingContext(sc, Seconds(5))
+
+     val topicsSet = Set("weblogs")
+     val kafkaParams = Map[String, String]("metadata.broker.list" -> "bigdata-pro01.kfk.com:9092")
+     val messages = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
+       ssc, kafkaParams, topicsSet)
+
+     val lines = messages.map(_._2)
+     val words = lines.flatMap(_.split(" "))
+     val wordCounts = words.map(x => (x, 1L)).reduceByKey(_ + _)
+     wordCounts.print()
+
+     // Start the computation
+     ssc.start()
+     ssc.awaitTermination()
+
+   }
+
+}
+
+```
+Start Kafka on node 1, then start a console producer:
+```
+bin/kafka-server-start.sh config/server.properties
+bin/kafka-console-producer.sh --broker-list bigdata-pro01.kfk.com:9092 --topic weblogs
+
+```
+Result:
+![](http://ww1.sinaimg.cn/large/005BOtkIly1fzr9tkp4szj30om07paas.jpg)
\ No newline at end of file
diff --git "a/news-bigdataproject/1\343\200\201\351\241\271\347\233\256\351\234\200\346\261\202.md" "b/news-bigdataproject/1\343\200\201\351\241\271\347\233\256\351\234\200\346\261\202.md"
new file mode 100644
index 0000000..398e8ec
--- /dev/null
+++ "b/news-bigdataproject/1\343\200\201\351\241\271\347\233\256\351\234\200\346\261\202.md"
@@ -0,0 +1,39 @@
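+
+## Chapter 1: Project Requirements Analysis and Design
+date: 2018-12-19 11:52:11
+
+### Project overview
+**Goals**
+1. Complete the architecture design, installation and deployment, component integration and development, and interactive visualization design of a big data project
+
+2. Implement real-time online data analysis
+
+3. Implement offline data analysis
+
+**Features**
+
+1) Capture user browsing logs
+
+2) Analyze the top 20 news topics by traffic in real time
+
+3) Count in real time the news topics currently exposed online
+
+4) Find the time period with the highest user page views
+
+5) Reports
+
+### Technologies used
+Hadoop 2.x, ZooKeeper, Flume, Hive, HBase, Kafka, Spark 2.x, Spark Streaming, MySQL, Hue, J2EE, WebSocket, ECharts
+
+### Development tools
+Virtual machines: VMware, CentOS
+SSH client: SecureCRT (connects to several virtual machines from Windows)
+Source editing: IDEA
+Viewing data: Notepad++ (with the NppFTP plugin installed it can edit configuration files on the virtual machines directly, which is very handy)
+
+### Project architecture
+Image from 卡夫卡公司 (the Kafka company)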
+![](http://ww1.sinaimg.cn/large/005BOtkIly1fyccyao7f3j30op0ee10a.jpg)
+
+
+
diff --git "a/news-bigdataproject/2\343\200\201linux\351\205\215\347\275\256.md" "b/news-bigdataproject/2\343\200\201linux\351\205\215\347\275\256.md"
new file mode 100644
index 0000000..afa8fd7
--- /dev/null
+++ "b/news-bigdataproject/2\343\200\201linux\351\205\215\347\275\256.md"
@@ -0,0 +1,106 @@
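+
+## Chapter 2: Preparing and Configuring the Linux Environment
+date: 2018-12-19 21:52:11
+
+
+
+### Environment overview
+The environment is built with VMware virtual machines running CentOS; the host laptop should have at least 8 GB of RAM.
+At least 3 virtual machines must be cloned, each with 2 GB of memory.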
+![](http://ww1.sinaimg.cn/large/005BOtkIly1fycdbmkr58j30m20ckq81.jpg)
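+### Key Linux configuration steps
+
+**1) Set the IP address**
+The course video changes the IP through the desktop GUI, which is convenient. If Linux has no GUI installed, edit the file with vi /etc/sysconfig/network-scripts/ifcfg-eth0 and then restart the network service with service network restart. Reference: [click here][1]
+
+**2) Create a user**
+In big data projects you normally do not work as root directly; create a dedicated user instead, for example kfk.
+a) Create the user: adduser kfk
+b) Set the user's password: passwd kfk
+
+**3) Set the hostname in the configuration file**
+The default Linux hostname is localhost, which is inconvenient for later cluster operations, so change it manually.
+a) Show the hostname: hostname
+b) Change the hostname
+vi /etc/sysconfig/network
+NETWORKING=yes
+HOSTNAME=bigdata-pro01.kfk.com
+
+**4) Map hostnames to IP addresses**
+To reach a Linux machine by hostname, you also need to map the hostname to its IP address.
+vi /etc/hosts
+192.168.31.151 bigdata-pro01.kfk.com
+After the change, reboot the machine.
+To reach the Linux machines by hostname from Windows as well, add the same mapping to the Windows hosts file: open C:\WINDOWS\system32\drivers\etc\hosts and add:
+192.168.31.151 bigdata-pro01.kfk.com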
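+
+On a three-node cluster every machine needs all three mappings. A minimal sketch of adding them without creating duplicates is shown below. It writes to a temporary file rather than /etc/hosts, and the addresses ending in .152 and .153 are assumptions, since only the address of the first node is given above.
+
+```shell
+# add_host FILE IP NAME: append "IP NAME" unless NAME is already mapped in FILE.
+add_host() {
+  if ! awk -v h="$3" '{ for (i = 2; i <= NF; i++) if ($i == h) found = 1 } END { exit !found }' "$1"; then
+    echo "$2 $3" >> "$1"
+  fi
+}
+
+hosts_file=$(mktemp)   # stand-in for /etc/hosts
+add_host "$hosts_file" 192.168.31.151 bigdata-pro01.kfk.com
+add_host "$hosts_file" 192.168.31.152 bigdata-pro02.kfk.com   # assumed IP
+add_host "$hosts_file" 192.168.31.153 bigdata-pro03.kfk.com   # assumed IP
+add_host "$hosts_file" 192.168.31.151 bigdata-pro01.kfk.com   # no-op: already present
+cat "$hosts_file"
+```
+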
+
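+**5) Allow the kfk user to use sudo without a password (configured as root)**
+While working on Linux, the kfk user often needs to touch files owned by root, but access is denied or a password is required. Adding the following line to /etc/sudoers lets kfk run commands as root through sudo without a password.
+vi /etc/sudoers
+... add the following line
+kfk ALL=(root)NOPASSWD:ALL
+
+**6) Disable the firewall**
+A firewall protects the server, but it can also get in the way; for example, it can block communication between the nodes of a Hadoop cluster, so we turn it off. First disable SELinux permanently (SELinux is a separate access-control mechanism, not the firewall itself):
+vi /etc/sysconfig/selinux
+SELINUX=disabled
+Save, reboot, and then check that the firewall is off.
+a) Show firewall status: service iptables status
+b) Start the firewall: service iptables start
+c) Stop the firewall: service iptables stop
+d) Keep the firewall off after reboots: chkconfig iptables off
+
+**7) Remove the JDK that ships with Linux**
+You will usually install a compatible JDK version by hand, so the JDK bundled with Linux has to be removed first:
+a) List the bundled JDK packages
+rpm -qa|grep java
+b) Remove the bundled JDK packages
+rpm -e --nodeps [jdk-package-1 jdk-package-2 ...]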
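+### Cloning the virtual machine and configuring the clones
+With the general Linux setup done, the next step is to clone the virtual machine and configure the clones.
+**1) As the kfk user, create the directories we will use**
+```shell
+# software packages
+mkdir /opt/softwares
+# installed modules
+mkdir /opt/modules
+# tools
+mkdir /opt/tools
+# data
+mkdir /opt/datas
+```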
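+**2) Install the JDK (1.7 or later, earlier than 1.9)**
+The big data platform runs on the JVM, so the JDK must be installed and configured in advance. Since we installed 64-bit CentOS, install the matching 64-bit JDK.
+The steps below use JDK 1.7; I actually use jdk1.8.0_191.
+a) Upload the JDK archive to /opt/softwares
+b) Extract the JDK archive
+
+    # extract
+    tar -zxf jdk-7u67-linux-x64.tar.gz -C /opt/modules/
+    # check the result
+    ls /opt/modules/
+    jdk1.7.0_67
+
+c) Configure the Java environment variables
+
+    vi /etc/profile
+    export JAVA_HOME=/opt/modules/jdk1.7.0_67
+    export PATH=$PATH:$JAVA_HOME/bin
+
+d) Check that Java is installed
+
+    java -version
+    java version "1.7.0_67"
+    Java(TM) SE Runtime Environment (build 1.7.0_67-b15)
+    Java HotSpot(TM) 64-Bit Server VM (build 24.79-b02, mixed mode)
+
+**3) Clone the virtual machine**
+
+Shut the virtual machine down before cloning. Then right-click it and choose Manage > Clone > Next > Next > Create a full clone > Next, pick a location for the clone (create the folder beforehand), name the new machine Hadoop-Linux-pro-2, and click Finish.
+Create the third virtual machine, Hadoop-Linux-pro-3, the same way.
+
+**4) Configure the clones**
+After cloning Hadoop-Linux-pro-2 and Hadoop-Linux-pro-3, configure their IP address, hostname, and IP-to-hostname mapping the same way as for Hadoop-Linux-pro-1. [Reference][2]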
+
+
+  [1]: https://www.willxu.xyz/2018/08/23/hadoop/1%E3%80%81vmware%E4%B8%8A%E7%BD%91%E9%85%8D%E7%BD%AE/
+  [2]: https://www.willxu.xyz/2018/08/23/hadoop/1%E3%80%81vmware%E4%B8%8A%E7%BD%91%E9%85%8D%E7%BD%AE/
\ No newline at end of file
diff --git "a/news-bigdataproject/3\343\200\201hadoop\351\203\250\347\275\262.md" "b/news-bigdataproject/3\343\200\201hadoop\351\203\250\347\275\262.md"
new file mode 100644
index 0000000..d005000
--- /dev/null
+++ "b/news-bigdataproject/3\343\200\201hadoop\351\203\250\347\275\262.md"
@@ -0,0 +1,113 @@
+## Chapter 3: Deploying a Distributed Hadoop 2.x Cluster
+date: 2018-12-19 22:52:11
+
+
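+### Cluster resource planning
+
+The environment is built with VMware virtual machines running CentOS; the host laptop should have at least 8 GB of RAM.
+At least 3 virtual machines must be cloned, each with 2 GB of memory.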
+![](http://ww1.sinaimg.cn/large/005BOtkIly1fycdbmkr58j30m20ckq81.jpg)
+
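+### Key configuration steps
+
+**(1) Download and install Hadoop 2.x**
+Download a 2.x release from the official site.
+
+**(2) Key Hadoop configuration**
+Follow the official example: http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SingleCluster.html
+The full list of configuration options is linked at the bottom left of that page.
+**1) Hadoop 2.x distributed cluster configuration: HDFS**  
+Installing HDFS requires editing 4 configuration files: hadoop-env.sh, core-site.xml, hdfs-site.xml and slaves
+**2) Hadoop 2.x distributed cluster configuration: YARN**
+Installing YARN requires editing 4 configuration files: yarn-env.sh, mapred-env.sh, yarn-site.xml and mapred-site.xml
+
+**(3) Distribute the configuration to the nodes**
+It is best to set up passwordless scp first, which requires generating an SSH key pair (see section (6)).
+Once Hadoop is configured on the first node, copy it to the other two nodes as follows.
+Copy the installation to the second node:
+scp -r hadoop-2.5.0 kfk@bigdata-pro02.kfk.com:/opt/modules/
+Copy the installation to the third node:
+scp -r hadoop-2.5.0 kfk@bigdata-pro03.kfk.com:/opt/modules/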
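+
+With more nodes, the same copy is easier to express as a loop. This is a dry-run sketch: it only prints the scp commands so they can be reviewed first. The kfk account is the user created in the Linux setup chapter, and the function name distribute is made up for illustration.
+
+```shell
+# Target nodes for the copy (every node except the one the command runs on).
+nodes="bigdata-pro02.kfk.com bigdata-pro03.kfk.com"
+
+# distribute DIR: print one scp command per target node (dry run only).
+distribute() {
+  for node in $nodes; do
+    echo "scp -r $1 kfk@${node}:/opt/modules/"
+  done
+}
+
+distribute hadoop-2.5.0
+```
+
+Piping the output to sh performs the actual copy once passwordless SSH is in place.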
+
+**(4) Start the HDFS cluster and test it**
+Once HDFS is configured, the HDFS cluster can be started.
+1. Format the NameNode
+Run bin/hdfs namenode -format to format the NameNode.
+2. Start the services on each node
+1) Start the NameNode: sbin/hadoop-daemon.sh start namenode
+2) Start a DataNode: sbin/hadoop-daemon.sh start datanode
+3) Start the ResourceManager: sbin/yarn-daemon.sh start resourcemanager
+4) Start a NodeManager: sbin/yarn-daemon.sh start nodemanager
+5) Start the job history server: sbin/mr-jobhistory-daemon.sh start historyserver
+
+**(5) Test the YARN cluster with a MapReduce job**
+With HDFS and YARN both running, run the WordCount program to check that the cluster works.
+Command for the bundled WordCount example: bin/yarn jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.0.jar wordcount input output
+
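+**(6) Passwordless SSH login** (can be set up in advance)
+While building the cluster, files have to be copied between nodes, and typing a password every time is tedious. Hadoop's cluster start scripts also launch the services on every node in one go, which likewise requires passwordless login between nodes. Steps:
+1. On the master node, create the .ssh directory, then generate the public key file id_rsa.pub and the private key file id_rsa
+mkdir .ssh
+ssh-keygen -t rsa
+2. Copy the public key to every machine
+ssh-copy-id bigdata-pro01.kfk.com
+ssh-copy-id bigdata-pro02.kfk.com
+ssh-copy-id bigdata-pro03.kfk.com
+3. Test the SSH connections
+ssh bigdata-pro01.kfk.com
+ssh bigdata-pro02.kfk.com
+ssh bigdata-pro03.kfk.com
+4. Test HDFS
+With passwordless SSH in place, a single command on the master node starts the HDFS services on all nodes:
+sbin/start-dfs.sh
+If YARN and HDFS share the same master node, configuring that one node is enough. Otherwise passwordless SSH must also be set up separately on the YARN master node.
+
+**(7) Synchronize clocks across the cluster (using Linux ntp)**
+Choose one machine as the time server, for example bigdata-pro01.kfk.com.
+1. Check whether the ntp service is installed
+sudo rpm -qa|grep ntp
+2. Managing the ntp service
+1) Show ntp status
+sudo service ntpd status
+2) Start ntp
+sudo service ntpd start
+3) Stop ntp
+sudo service ntpd stop
+3. Start ntp at boot
+sudo chkconfig ntpd on
+4. Edit the ntp configuration file
+vi /etc/ntp.conf
+Uncomment the following line and change the IP address to:
+restrict 192.168.31.151 mask 255.255.255.0 nomodify notrap
+Comment out the following lines
+server 0.centos.pool.ntp.org iburst
+server 1.centos.pool.ntp.org iburst
+server 2.centos.pool.ntp.org iburst
+server 3.centos.pool.ntp.org iburst
+Uncomment the following lines
+server 127.127.1.0 #local clock
+fudge 127.127.1.0 stratum 10
+Restart the ntp service
+sudo service ntpd restart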
+5. Set the server time
+
+    # set the current date
+    sudo date -s 2017-06-16
+    # set the current time
+    sudo date -s 22:06:00
+
+6. Manually sync the other nodes with the time server
+
+    # locate ntpdate
+    which ntpdate
+    /usr/sbin/ntpdate
+    # run on bigdata-pro02.kfk.com and on bigdata-pro03.kfk.com
+    sudo /usr/sbin/ntpdate bigdata-pro01.kfk.com
+
+7. Sync the other nodes with the time server on a schedule
+On bigdata-pro02.kfk.com and bigdata-pro03.kfk.com, switch to the root user and use crontab -e to sync with the time server every 10 minutes.
+
+    crontab -e
+    # every 10 minutes, sync with the time server bigdata-pro01.kfk.com
+    0-59/10 * * * *  /usr/sbin/ntpdate bigdata-pro01.kfk.com
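+
+The minute field 0-59/10 means "every 10th minute in the range 0 to 59". The short sketch below lists the minutes at which the job fires, which is a quick way to read such a step expression.
+
+```shell
+# Expand the cron minute field 0-59/10: start at 0, step 10, stop at 59.
+minutes=$(seq 0 10 59 | tr '\n' ' ')
+echo "fires at minutes: $minutes"
+```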
+
diff --git "a/news-bigdataproject/4\343\200\201zk\351\203\250\347\275\262.md" "b/news-bigdataproject/4\343\200\201zk\351\203\250\347\275\262.md"
new file mode 100644
index 0000000..7d38196
--- /dev/null
+++ "b/news-bigdataproject/4\343\200\201zk\351\203\250\347\275\262.md"
@@ -0,0 +1,71 @@
+## Chapter 4: Deploying a Distributed ZooKeeper Cluster
+date: 2018-12-29 14:52:11
+
+
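+### Introduction to ZooKeeper
+ZooKeeper is a reliable coordination system for large distributed systems. It provides configuration maintenance, naming, distributed synchronization, group services and more. Its goal is to encapsulate complex, error-prone key services and offer users a simple interface on top of an efficient, stable system. ZooKeeper has become a foundational component of the Hadoop ecosystem.
+You can use either the Apache release or the Cloudera release of ZooKeeper.
+1) Download the Apache release of ZooKeeper.
+2) Download the Cloudera release of ZooKeeper.
+
+### ZooKeeper deployment steps
+**1. Download ZooKeeper**
+This guide uses the CDH release zookeeper-3.4.5-cdh5.10.0.tar.gz. Upload the archive to /opt/softwares on bigdata-pro01.kfk.com.
+**2. Extract ZooKeeper**
+tar -zxf zookeeper-3.4.5-cdh5.10.0.tar.gz -C /opt/modules/
+**3. Edit the configuration**
+1) Copy the sample configuration file
+cp conf/zoo_sample.cfg conf/zoo.cfg
+2) Edit zoo.cfg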
+```xml
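+vi conf/zoo.cfg
+# length of one tick in milliseconds: the heartbeat interval between ZooKeeper servers and between clients and servers
+tickTime=2000
+# maximum number of ticks a follower may take to connect to and sync with the leader at startup
+initLimit=10
+# maximum number of ticks allowed between a request and its acknowledgement between leader and follower
+syncLimit=5
+# data directory; create it beforehand
+dataDir=/opt/modules/zookeeper-3.4.5-cdh5.10.0/zkData
+# client port
+clientPort=2181
+# server.<id>=<host>:<quorum communication port>:<leader election port>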
+server.1=bigdata-pro01.kfk.com:2888:3888
+server.2=bigdata-pro02.kfk.com:2888:3888
+server.3=bigdata-pro03.kfk.com:2888:3888
+```
+**4. Distribute to the other nodes**
+Copy the configured ZooKeeper installation to the other two nodes:
+scp -r zookeeper-3.4.5-cdh5.10.0/ bigdata-pro02.kfk.com:/opt/modules/
+scp -r zookeeper-3.4.5-cdh5.10.0/ bigdata-pro03.kfk.com:/opt/modules/
+**5. Create the required directories and files**
+1) Create the data directory on all 3 nodes
+mkdir /opt/modules/zookeeper-3.4.5-cdh5.10.0/zkData
+2) In the data directory on each node, create a myid file whose content is that node's server id
+```shell
+# change to the data directory
+cd /opt/modules/zookeeper-3.4.5-cdh5.10.0/zkData
+# on bigdata-pro01.kfk.com
+touch myid
+vi myid
+1
+# on bigdata-pro02.kfk.com
+touch myid
+vi myid
+2
+# on bigdata-pro03.kfk.com
+touch myid
+vi myid
+3
+```
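+
+The id written to myid has to match the server.N entry for that host in zoo.cfg. As a sketch, the id can be looked up from the configuration instead of being typed by hand. The helper name myid_for and the temporary files are made up for illustration; on a real node the files would be conf/zoo.cfg and zkData/myid.
+
+```shell
+# myid_for CFG HOST: print the N of the "server.N=HOST:..." line for HOST.
+myid_for() {
+  awk -F= -v h="$2" '$1 ~ /^server\.[0-9]+$/ {
+    split($2, parts, ":")
+    if (parts[1] == h) { sub(/^server\./, "", $1); print $1 }
+  }' "$1"
+}
+
+workdir=$(mktemp -d)   # stand-in for the ZooKeeper directories
+cat > "$workdir/zoo.cfg" <<'EOF'
+server.1=bigdata-pro01.kfk.com:2888:3888
+server.2=bigdata-pro02.kfk.com:2888:3888
+server.3=bigdata-pro03.kfk.com:2888:3888
+EOF
+
+myid_for "$workdir/zoo.cfg" bigdata-pro02.kfk.com > "$workdir/myid"
+cat "$workdir/myid"
+```
+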
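+**6. Start the ZooKeeper service**
+1) Start ZooKeeper on every node
+bin/zkServer.sh start
+2) Check the status on every node
+bin/zkServer.sh status
+Each node should report Mode: follower or Mode: leader, with exactly one leader.
+3) Stop the service on every node
+bin/zkServer.sh stop
+4) Browse the ZooKeeper znode tree
+bin/zkCli.sh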
+
diff --git "a/news-bigdataproject/5\343\200\201ha\345\256\236\347\216\260.md" "b/news-bigdataproject/5\343\200\201ha\345\256\236\347\216\260.md"
new file mode 100644
index 0000000..8bf485d
--- /dev/null
+++ "b/news-bigdataproject/5\343\200\201ha\345\256\236\347\216\260.md"
@@ -0,0 +1,237 @@
+## Chapter 5: Hadoop High Availability (HA) Configuration
+date: 2018-12-29 15:52:11
+
+
+### HDFS HA architecture
+![](http://ww1.sinaimg.cn/large/005BOtkIly1fynlvh2aiyj30dp0boac3.jpg)
+
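+When a failure occurs and the active NN goes down, the standby NN reads all edit logs from the JournalNodes (JN) before it becomes active. This reliably keeps its namespace image consistent with that of the failed NN, so it can take over seamlessly and serve client requests, which achieves high availability.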
+
+### HDFS HA configuration files
+1. Edit hdfs-site.xml
+Follow the format in the official documentation: http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html
+```xml
+<configuration>
+	<property>
+	  <name>dfs.nameservices</name>
+	  <value>ns</value>
+	</property>
+	<property>
+	  <name>dfs.ha.namenodes.ns</name>
+	  <value>nn1,nn2</value>
+	</property>
+	<property>
+	  <name>dfs.namenode.rpc-address.ns.nn1</name>
+	  <value>bigdata-pro01.kfk.com:8020</value>
+	</property>
+	<property>
+	  <name>dfs.namenode.rpc-address.ns.nn2</name>
+	  <value>bigdata-pro02.kfk.com:8020</value>
+	</property>
+	<property>
+      <name>dfs.namenode.http-address.ns.nn1</name>
+      <value>bigdata-pro01.kfk.com:50070</value>
+    </property>
+	<property>
+       <name>dfs.namenode.http-address.ns.nn2</name>
+       <value>bigdata-pro02.kfk.com:50070</value>
+    </property>
+    <property>
+        <name>dfs.namenode.shared.edits.dir</name>
+        <value>qjournal://bigdata-pro01.kfk.com:8485;bigdata-pro02.kfk.com:8485;bigdata-pro03.kfk.com:8485/ns</value>
+    </property>
+	<property>
+       <name>dfs.journalnode.edits.dir</name>
+       <value>/opt/modules/hadoop-2.6.0/data/jn</value>
+    </property>
+	<property>
+		<name>dfs.client.failover.proxy.provider.ns</name>
+		<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
+    </property>
+	<property>
+        <name>dfs.ha.automatic-failover.enabled.ns</name>
+        <value>true</value>
+    </property>
+	<property>
+		<name>dfs.ha.fencing.methods</name>
+		<value>sshfence</value>
+	</property>
+    <property>
+        <name>dfs.ha.fencing.ssh.private-key-files</name>
+        <value>/home/kfk/.ssh/id_rsa</value>
+    </property>
+	<property>
+        <name>dfs.replication</name>
+        <value>3</value>
+    </property>
+	<property>
+        <name>dfs.permissions.enabled</name>
+        <value>false</value>
+    </property>
+</configuration>
+```
+2. Edit core-site.xml
+```xml
+<configuration>
+	<property>
+        <name>fs.defaultFS</name>
+        <value>hdfs://ns</value>
+	</property>
+	<property>
+        <name>hadoop.http.staticuser.user</name>
+        <value>kfk</value>
+	</property>	
+	<property>
+		<name>hadoop.tmp.dir</name>
+		<value>/opt/modules/hadoop-2.6.0/data/tmp</value>
+	</property>
+	<property>
+		<name>dfs.namenode.name.dir</name>
+		<value>file://${hadoop.tmp.dir}/dfs/name</value>
+	</property>
+	<property>
+		<name>ha.zookeeper.quorum</name>
+		<value>bigdata-pro01.kfk.com:2181,bigdata-pro02.kfk.com:2181,bigdata-pro03.kfk.com:2181</value>
+	</property>
+</configuration>
+
+```
+3. Distribute the changed configuration to the other nodes
+```
+scp hdfs-site.xml bigdata-pro02.kfk.com:/opt/modules/hadoop-2.6.0/etc/hadoop/
+scp hdfs-site.xml bigdata-pro03.kfk.com:/opt/modules/hadoop-2.6.0/etc/hadoop/
+scp core-site.xml bigdata-pro02.kfk.com:/opt/modules/hadoop-2.6.0/etc/hadoop/
+scp core-site.xml bigdata-pro03.kfk.com:/opt/modules/hadoop-2.6.0/etc/hadoop/
+```
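+### Testing automatic HDFS HA failover
+1. Start ZooKeeper on all nodes, then initialize the HA state in ZooKeeper
+ cd /opt/modules/zookeeper-3.4.5-cdh5.10.0/
+ bin/zkServer.sh start
+ bin/hdfs zkfc -formatZK            (run from the Hadoop directory; needed only the first time zkfc is used)
+2. Start HDFS
+ bin/hdfs namenode -format          (needed only the first time HDFS is used; run on a NameNode)
+ sbin/start-dfs.sh                    (starts the namenode/datanode/journalnode daemons on the respective nodes)
+ Note: on a brand-new HA cluster the JournalNodes usually have to be running (sbin/hadoop-daemon.sh start journalnode) before formatting, and the second NameNode is initialized with bin/hdfs namenode -bootstrapStandby.
+3. Start the zkfc daemon on the HA NameNode nodes (on both NameNodes)
+ sbin/hadoop-daemon.sh start zkfc
+ Check the state of the two NameNodes in the web UI: one is active (the one whose zkfc started first) and the other is standby.
+ http://bigdata-pro01.kfk.com:50070
+ http://bigdata-pro02.kfk.com:50070
+4. Upload a file to HDFS
+ bin/hdfs dfs -mkdir /usr
+ bin/hdfs dfs -put /opt/modules/hadoop-2.6.0/etc/hadoop/hdfs-site.xml /usr
+ The file is visible in the web UI.
+5. Kill the active NameNode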
+![](http://ww1.sinaimg.cn/large/005BOtkIly1fynmg2l3ptj30au032dfq.jpg)
+6. Check the NameNode states again
+The failover should have completed: the former standby is now active.
+![](http://ww1.sinaimg.cn/large/005BOtkIly1fynmjhur96j30py096t94.jpg)
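+### Problems encountered with HDFS HA (check console output and log files)
+**1. Console message: cannot resolve bigdata-pro03.kfk.com:2181**
+Cause: my core-site.xml was wrong. The value must be on a single line with no line breaks, otherwise it is parsed incorrectly.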
+```xml
+	<property>
+		<name>ha.zookeeper.quorum</name>
+		<value>bigdata-pro01.kfk.com:2181,bigdata-pro02.kfk.com:2181,bigdata-pro03.kfk.com:2181</value>
+	</property>
+```
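+
+One way to avoid stray line breaks is to generate the value instead of typing it. The sketch below joins a host list into a single-line quorum string. The function name quorum is made up for illustration, and port 2181 is the clientPort used in this setup.
+
+```shell
+# quorum HOST...: join hosts as host:2181,host:2181,... on a single line.
+quorum() {
+  joined=""
+  for host in "$@"; do
+    joined="${joined:+$joined,}${host}:2181"
+  done
+  printf '%s\n' "$joined"
+}
+
+value=$(quorum bigdata-pro01.kfk.com bigdata-pro02.kfk.com bigdata-pro03.kfk.com)
+echo "<value>${value}</value>"
+```
+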
+**2. sbin/start-dfs.sh fails to start the cluster**
+This script needs passwordless SSH, so:
+(1) On node 1
+ssh-keygen
+ssh-copy-id bigdata-pro01.kfk.com   (the node's own host must be included as well)
+ssh-copy-id bigdata-pro02.kfk.com
+ssh-copy-id bigdata-pro03.kfk.com
+(2) Test the SSH connections
+ssh bigdata-pro01.kfk.com
+ssh bigdata-pro02.kfk.com
+ssh bigdata-pro03.kfk.com
+**3. NameNode failover does not happen**
+bigdata-pro01.kfk.com can be elected active, but after it is killed bigdata-pro02.kfk.com is not elected active and stays standby.
+Check the zkfc log on bigdata-pro02.kfk.com:
+ tail -10f hadoop-kfk-zkfc-bigdata-pro02.kfk.com.log
+ ![](http://ww1.sinaimg.cn/large/005BOtkIly1fynn11ym39j30q10ftq5a.jpg)
+ **The lines underlined in red show that when bigdata-pro02.kfk.com tried to become active it had to fence pro01, and the fencing failed because SSH failed: node 2 could not SSH to node 1. Running ssh-keygen on node 2 and copying its public key to node 1 fixed the problem.**
+ ![](http://ww1.sinaimg.cn/large/005BOtkIly1fynn4p3524j30qd070aat.jpg)
+ 
+
+
+----------
+
+### YARN HA architecture
+![](http://ww1.sinaimg.cn/large/005BOtkIly1fynn8cje51j30hp0azdhq.jpg)
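+ResourceManager HA consists of an active/standby pair of nodes and uses an RMStateStore to persist internal state and the data and markers of the main applications.
+The available RMStateStore implementations are the in-memory MemoryRMStateStore, the file-system-based FileSystemRMStateStore, and the ZooKeeper-based ZKRMStateStore.
+The ResourceManager HA architecture is essentially the same as the NameNode HA architecture: state is shared through the RMStateStore, and the failover controller runs as a service inside the ResourceManager process rather than as a separate daemon.
+### YARN HA configuration
+**1. Edit yarn-site.xml**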
+```xml
+<configuration>
+    <property>
+        <name>yarn.nodemanager.aux-services</name>
+        <value>mapreduce_shuffle</value>
+    </property>
+	<property>
+        <name>yarn.log-aggregation-enable</name>
+        <value>true</value>
+    </property>
+	<property>
+        <name>yarn.log-aggregation.retain-seconds</name>
+        <value>10000</value>
+    </property>
+	<property>
+		<name>yarn.resourcemanager.ha.enabled</name>
+		<value>true</value>
+	</property>
+	<property>
+		<name>yarn.resourcemanager.cluster-id</name>
+		<value>rs</value>
+	</property>
+	<property>
+		<name>yarn.resourcemanager.ha.rm-ids</name>
+		<value>rm1,rm2</value>
+	</property>
+	<property>
+		<name>yarn.resourcemanager.hostname.rm1</name>
+		<value>bigdata-pro01.kfk.com</value>
+	</property>
+	<property>
+		<name>yarn.resourcemanager.hostname.rm2</name>
+		<value>bigdata-pro02.kfk.com</value>
+	</property>
+	<property>
+		  <name>yarn.resourcemanager.zk-address</name>
+		  <value>bigdata-pro01.kfk.com:2181,bigdata-pro02.kfk.com:2181,bigdata-pro03.kfk.com:2181</value>
+	</property>
+	<property>
+		<name>yarn.resourcemanager.recovery.enabled</name>
+		<value>true</value>
+	</property>
+	<property>
+		<name>yarn.resourcemanager.store.class</name>
+		<value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
+	</property>	
+
+	
+
+</configuration>
+
+```
+**2. Distribute to the other nodes**
+scp yarn-site.xml bigdata-pro02.kfk.com:/opt/modules/hadoop-2.6.0/etc/hadoop/
+scp yarn-site.xml bigdata-pro03.kfk.com:/opt/modules/hadoop-2.6.0/etc/hadoop/
+### Testing YARN HA failover
+1. Start YARN on the rm1 node
+ sbin/start-yarn.sh
+2. Start the ResourceManager on the rm2 node
+sbin/yarn-daemon.sh start resourcemanager
+3. Open the YARN web UIs
+http://bigdata-pro01.kfk.com:8088
+http://bigdata-pro02.kfk.com:8088
+4. Upload the input file for wordcount to HDFS and run the MapReduce example
+bin/hdfs dfs -put data/wc  /usr/kfk/data  
+bin/yarn jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar wordcount /usr/kfk/data/wc /usr/kfk/data/wc.out
+5. Kill the ResourceManager on rm1 while the job is halfway through
+The job moves to rm2 and continues there.
+Below is the log output on bigdata-pro01.kfk.com (the kill was issued from a second session on bigdata-pro01.kfk.com)
+![](http://ww1.sinaimg.cn/large/005BOtkIly1fynpuokvgdj30mb01xjrd.jpg)
+![](http://ww1.sinaimg.cn/large/005BOtkIly1fynpxiwdvhj31cb0b2jsz.jpg)
\ No newline at end of file
diff --git "a/news-bigdataproject/6\343\200\201hbase\351\203\250\347\275\262.md" "b/news-bigdataproject/6\343\200\201hbase\351\203\250\347\275\262.md"
new file mode 100644
index 0000000..23d2635
--- /dev/null
+++ "b/news-bigdataproject/6\343\200\201hbase\351\203\250\347\275\262.md"
@@ -0,0 +1,78 @@
+## Chapter 6: Deploying Highly Available HBase on Hadoop HA
+date: 2018-12-30 15:52:11
+
+
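+### HBase overview and design
+HBase is a highly reliable, high-performance, column-oriented, scalable distributed storage system. With HBase you can build large-scale storage clusters for unstructured data on inexpensive PC servers. Underneath, its data lives in a directory on HDFS.
+Download the Apache release of HBase: https://archive.apache.org/dist/
+Download the Cloudera release of HBase: http://archive-primary.cloudera.com/cdh5/cdh/5/
+This guide uses the Cloudera release hbase-1.0.0-cdh5.4.0.tar.gz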
+![](http://ww1.sinaimg.cn/large/005BOtkIly1fynxysp89aj30js08mn04.jpg)
+![](http://ww1.sinaimg.cn/large/005BOtkIly1fynyfa2lptj30y10esgm9.jpg)
+### Installing and deploying HBase
+![](http://ww1.sinaimg.cn/large/005BOtkIly1fynxzha2acj30hr053weh.jpg)
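+1. Extract the archive into /opt/modules/
+2. Edit the configuration files
+**a.hbase-env.sh**
+Set the JDK:
+export JAVA_HOME=/opt/modules/jdk1.8.0_191
+Use the external ZooKeeper instead of the one managed by HBase:
+export HBASE_MANAGES_ZK=false
+**b.hbase-site.xml**
+This configuration targets the Hadoop HA cluster: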
+```xml
+<configuration>
+	<property>
+    		<name>hbase.rootdir</name>
+    		<value>hdfs://ns/hbase</value>
+	</property>
+	<property>
+    		<name>hbase.cluster.distributed</name>
+    		<value>true</value>
+	</property>
+	<property>
+		<name>hbase.zookeeper.quorum</name>
+		<value>bigdata-pro01.kfk.com,bigdata-pro02.kfk.com,bigdata-pro03.kfk.com</value>
+	</property>
+</configuration>
+```
+**c.regionservers**
+bigdata-pro01.kfk.com
+bigdata-pro02.kfk.com
+bigdata-pro03.kfk.com
+
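+**3. Copy hdfs-site.xml and core-site.xml from Hadoop into HBase's conf directory**
+Otherwise HBase fails to start because it cannot resolve ns, which is defined in the Hadoop configuration. The log looks like this: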
+![](http://ww1.sinaimg.cn/large/005BOtkIly1fyny4rdsrfj30pu0cjdh8.jpg)
+4. Distribute the HBase installation to every node
+scp -r hbase-1.0.0-cdh5.4.0 bigdata-pro02.kfk.com:/opt/modules/
+scp -r hbase-1.0.0-cdh5.4.0 bigdata-pro03.kfk.com:/opt/modules/
+
+### Starting and testing HBase
+1. Start ZooKeeper first
+    zkServer.sh start
+2. Start HDFS in HA mode
+    sbin/start-dfs.sh (starts the namenode/datanode/journalnode daemons on the respective nodes)
+Start the zkfc daemon on the HA NameNode nodes (on both NameNodes)
+sbin/hadoop-daemon.sh start zkfc
+3. Start HBase
+bin/start-hbase.sh
+4. Open the HBase web UI
+bigdata-pro01.kfk.com:60010/
+5. Test HBase master high availability
+```
+Start a backup master on bigdata-pro02.kfk.com:
+./hbase-daemon.sh start master
+Then kill the HMaster on bigdata-pro01.kfk.com.
+ZooKeeper automatically fails over to the other master.
+```
+![](http://ww1.sinaimg.cn/large/005BOtkIly1fynycepyczj30xp0cl3zb.jpg)
+
+### Testing the HBase shell
+1. Start the shell
+bin/hbase shell
+2. Create a table
+create 'weblogs','info'
+3. List tables
+list
+![](http://ww1.sinaimg.cn/large/005BOtkIly1fyososol82j306q02o0sk.jpg)
\ No newline at end of file
diff --git "a/news-bigdataproject/7\343\200\201kafka\351\203\250\347\275\262.md" "b/news-bigdataproject/7\343\200\201kafka\351\203\250\347\275\262.md"
new file mode 100644
index 0000000..27e2af7
--- /dev/null
+++ "b/news-bigdataproject/7\343\200\201kafka\351\203\250\347\275\262.md"
@@ -0,0 +1,87 @@
+## Chapter 7: Kafka Overview and Distributed Deployment
+date: 2019-1-1 21:13:01
+
+
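+### Introduction to Kafka
+Kafka is a distributed messaging system written in Scala, widely used because it scales horizontally and offers high throughput.
+More and more open-source distributed processing systems, such as Cloudera, Apache Storm and Spark, support integration with Kafka.
+![](http://ww1.sinaimg.cn/large/005BOtkIly1fyrd3u03srj30ja09k791.jpg)
+For the terms in the diagram see: https://www.cnblogs.com/hei12138/p/7805475.html
+**Understand ZooKeeper's role here and get a basic picture of how producers and consumers relate.**
+Download from the official site: http://kafka.apache.org/
+This guide uses kafka_2.10-0.9.0.0.tgz
+
+### Distributed Kafka deployment
+1. Extract
+tar -zxf kafka_2.10-0.9.0.0.tgz  -C /opt/modules/
+2. Configure server.properties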
+```properties
+# unique id of this broker
+broker.id=1
+
+listeners=PLAINTEXT://bigdata-pro01.kfk.com:9092
+# default port
+port=9092
+# hostname to bind to
+host.name=bigdata-pro01.kfk.com
+# Kafka data directory
+log.dirs=/opt/modules/kafka_2.10-0.9.0.0/kafka-logs
+# ZooKeeper connection string
+zookeeper.connect=bigdata-pro01.kfk.com:2181,bigdata-pro02.kfk.com:2181,bigdata-pro03.kfk.com:2181
+
+```
+3. Configure zookeeper.properties
+```properties
+# ZooKeeper data directory; keep it consistent with the ZooKeeper cluster configuration
+dataDir=/opt/modules/zookeeper-3.4.5-cdh5.10.0/zkData
+
+```
+
+4. Configure consumer.properties
+```
+# ZooKeeper address
+zookeeper.connect=bigdata-pro01.kfk.com:2181,bigdata-pro02.kfk.com:2181,bigdata-pro03.kfk.com:2181
+```
+5. Configure producer.properties
+```
+# Kafka broker list, spread over the three machines
+metadata.broker.list=bigdata-pro01.kfk.com:9092,bigdata-pro02.kfk.com:9092,bigdata-pro03.kfk.com:9092
+```
+6. Copy to the other nodes
+scp -r kafka_2.10-0.9.0.0 bigdata-pro02.kfk.com:/opt/modules/
+scp -r kafka_2.10-0.9.0.0 bigdata-pro03.kfk.com:/opt/modules/
+7. Edit server.properties on the other two nodes
+```
+# on bigdata-pro02.kfk.com
+broker.id=2
+listeners=PLAINTEXT://bigdata-pro02.kfk.com:9092
+host.name=bigdata-pro02.kfk.com
+# on bigdata-pro03.kfk.com
+broker.id=3
+listeners=PLAINTEXT://bigdata-pro03.kfk.com:9092
+host.name=bigdata-pro03.kfk.com
+
+```
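+
+The three per-node values differ only in the node number, so they can be generated. A minimal sketch, assuming the bigdata-pro0N.kfk.com naming used throughout; the function name broker_overrides is made up for illustration.
+
+```shell
+# broker_overrides N: print the server.properties lines that differ on node N.
+broker_overrides() {
+  printf 'broker.id=%s\n' "$1"
+  printf 'listeners=PLAINTEXT://bigdata-pro0%s.kfk.com:9092\n' "$1"
+  printf 'host.name=bigdata-pro0%s.kfk.com\n' "$1"
+}
+
+for n in 2 3; do
+  echo "# node $n"
+  broker_overrides "$n"
+done
+```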
+
+### Testing Kafka
+```
+1. Start ZooKeeper on all nodes
+bin/zkServer.sh start
+2. Start Kafka on each node
+bin/kafka-server-start.sh config/server.properties &
+3. Create a topic
+bin/kafka-topics.sh --zookeeper localhost:2181 --create --topic test --replication-factor 1 --partitions 1
+4. List and describe topics
+bin/kafka-topics.sh --zookeeper localhost:2181 --list
+
+bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic test
+Output:
+        Topic:test      PartitionCount:1        ReplicationFactor:1     Configs:
+        Topic: test     Partition: 0    Leader: 2       Replicas: 2     Isr: 2
+5. Produce messages (node 1)
+bin/kafka-console-producer.sh --broker-list bigdata-pro01.kfk.com:9092 --topic test
+6. Consume messages (node 2)
+bin/kafka-console-consumer.sh --zookeeper bigdata-pro02.kfk.com:2181 --topic test --from-beginning
+```
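+A further note on how partitions relate to consumption (a little hard to grasp at first, but not critical):
+A topic can have multiple partitions, and there are several partitioning strategies. Consumption is organized around consumer groups. First, if all consumers use the same group name, the partitions are divided among them, so each partition is read by exactly one consumer in the group. Second, if the consumers use different group names, every message in every partition is delivered to each group. Third, if no group is specified, the consumer effectively forms a group of its own (the console consumer, for example, generates a random group ID), so it receives every message.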
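+
+The first case (consumers sharing one group name) can be illustrated with a small simulation. This is a minimal sketch in plain Java, not Kafka's actual assignment code: the class `GroupAssignmentSketch` and its `assign` method are hypothetical names, and the round-robin spreading is an assumption made only for illustration.
+
+```java
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.LinkedHashMap;
+import java.util.List;
+import java.util.Map;
+
+public class GroupAssignmentSketch {
+
+    // Spread partition ids 0..partitions-1 over the consumers of ONE group, round-robin.
+    // Each partition ends up with exactly one owner inside the group.
+    public static Map<String, List<Integer>> assign(int partitions, List<String> consumers) {
+        Map<String, List<Integer>> owned = new LinkedHashMap<>();
+        for (String consumer : consumers) {
+            owned.put(consumer, new ArrayList<>());
+        }
+        for (int p = 0; p < partitions; p++) {
+            owned.get(consumers.get(p % consumers.size())).add(p);
+        }
+        return owned;
+    }
+
+    public static void main(String[] args) {
+        // Same group name: three partitions are split between two consumers.
+        System.out.println(assign(3, Arrays.asList("c1", "c2")));
+        // Different group names: each group is assigned on its own, so each one sees all partitions.
+        System.out.println(assign(3, Arrays.asList("only-member-of-group-A")));
+        System.out.println(assign(3, Arrays.asList("only-member-of-group-B")));
+    }
+}
+```
+
+When a group has more consumers than the topic has partitions, some lists in this sketch stay empty, which matches the idea that a partition is never shared within one group.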
\ No newline at end of file
diff --git "a/news-bigdataproject/8\343\200\201flume\351\203\250\347\275\262.md" "b/news-bigdataproject/8\343\200\201flume\351\203\250\347\275\262.md"
new file mode 100644
index 0000000..5acd4e2
--- /dev/null
+++ "b/news-bigdataproject/8\343\200\201flume\351\203\250\347\275\262.md"
@@ -0,0 +1,78 @@
+## Chapter 8: Introduction to Flume and distributed deployment
+date: 2019-1-1 21:30:01
+
+
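+### Introduction to Flume
+
+Flume is a highly available, highly reliable, distributed system from Cloudera for collecting, aggregating, and transporting large volumes of log data. It supports customizable data senders in a logging system for collecting data, and it can apply simple processing to the data and write it to a variety of customizable receivers.
+![](http://ww1.sinaimg.cn/large/005BOtkIly1fyre9dzhbfj30m208g75x.jpg)
+For a beginner introduction, this article is enough: https://www.cnblogs.com/zhangyinhua/p/7803486.html#_label0
+
+Download: the Apache release apache-flume-1.7.0-bin.tar.gz
+**How Flume is used in this project**
+![](http://ww1.sinaimg.cn/large/005BOtkIly1fyrego248hj30go0gp43u.jpg)
+This project uses three Flume agents in total.
+Two nodes collect logs (nodes 2 and 3), and one node (node 1) aggregates the collected data and forwards it to Kafka and HBase.
+### Flume deployment
+**(This chapter deploys and configures only the collection agents on nodes 2 and 3; the aggregation and forwarding agent on node 1 is covered later.)**
+Each step can be checked against the official documentation: http://flume.apache.org/
+
+1. Extract Flume
+tar -zxf apache-flume-1.7.0-bin.tar.gz  -C /opt/modules/
+```
+vi flume-env.sh
+# Set the environment variables
+export JAVA_HOME=/opt/modules/jdk1.8.0_191
+export HADOOP_HOME=/opt/modules/hadoop-2.6.0
+export HBASE_HOME=/opt/modules/hbase-1.0.0-cdh5.4.0
+```
+2. Distribute Flume to the other two nodes
+scp -r flume-1.7.0-bin bigdata-pro02.kfk.com:/opt/modules/
+scp -r flume-1.7.0-bin bigdata-pro03.kfk.com:/opt/modules/
+3. Flume agent-2 collection-node configuration (on bigdata-pro02.kfk.com)
+An agent has three parts: sources, channels, and sinks.
+/opt/datas/weblogs.log is the log we want to collect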
+```
+vi flume-conf.properties
+
+agent2.sources = r1
+agent2.channels = c1
+agent2.sinks = k1
+
+agent2.sources.r1.type = exec
+agent2.sources.r1.command = tail -F /opt/datas/weblog-flume.log
+agent2.sources.r1.channels = c1
+
+agent2.channels.c1.type = memory
+agent2.channels.c1.capacity = 10000
+agent2.channels.c1.transactionCapacity = 10000
+agent2.channels.c1.keep-alive = 5
+
+agent2.sinks.k1.type = avro
+agent2.sinks.k1.channel = c1
+agent2.sinks.k1.hostname = bigdata-pro01.kfk.com
+agent2.sinks.k1.port = 5555
+```
+4. Flume agent-3 collection-node configuration (on bigdata-pro03.kfk.com)
+
+```
+vi flume-conf.properties
+
+agent3.sources = r1
+agent3.channels = c1
+agent3.sinks = k1
+
+agent3.sources.r1.type = exec
+agent3.sources.r1.command = tail -F /opt/datas/weblog-flume.log
+agent3.sources.r1.channels = c1
+
+agent3.channels.c1.type = memory
+agent3.channels.c1.capacity = 10000
+agent3.channels.c1.transactionCapacity = 10000
+agent3.channels.c1.keep-alive = 5
+
+agent3.sinks.k1.type = avro
+agent3.sinks.k1.channel = c1
+agent3.sinks.k1.hostname = bigdata-pro01.kfk.com
+agent3.sinks.k1.port = 5555
+```
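+
+How the three parts of an agent (source, channel, sink) fit together can be sketched in plain Java. This is only a conceptual model under stated assumptions, not Flume code: `ChannelSketch` and `pipe` are hypothetical names, a bounded `ArrayBlockingQueue` stands in for the memory channel and its `capacity` setting, and two lists stand in for the exec source and the avro sink.
+
+```java
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.List;
+import java.util.concurrent.ArrayBlockingQueue;
+import java.util.concurrent.BlockingQueue;
+
+public class ChannelSketch {
+
+    // Move events from a "source" list through a bounded in-memory "channel" into a "sink" list.
+    public static List<String> pipe(List<String> sourceEvents, int capacity) {
+        BlockingQueue<String> channel = new ArrayBlockingQueue<>(capacity);
+        List<String> sink = new ArrayList<>();
+        for (String event : sourceEvents) {
+            // offer() returns false when the channel is full: let the sink drain it, then retry.
+            while (!channel.offer(event)) {
+                channel.drainTo(sink);
+            }
+        }
+        // Flush whatever is still buffered in the channel.
+        channel.drainTo(sink);
+        return sink;
+    }
+
+    public static void main(String[] args) {
+        // A capacity of 2 forces the channel to fill up and be drained along the way.
+        System.out.println(pipe(Arrays.asList("line-1", "line-2", "line-3"), 2));
+    }
+}
+```
+
+The point of the buffer is that the source and the sink do not have to run at the same speed; the channel absorbs the difference up to its capacity.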
diff --git "a/news-bigdataproject/9\343\200\201flume-hbase-kfk\351\205\215\347\275\256.md" "b/news-bigdataproject/9\343\200\201flume-hbase-kfk\351\205\215\347\275\256.md"
new file mode 100644
index 0000000..e254f9b
--- /dev/null
+++ "b/news-bigdataproject/9\343\200\201flume-hbase-kfk\351\205\215\347\275\256.md"
@@ -0,0 +1,117 @@
+## Chapter 9: Modifying the Flume source and integrating HBase + Kafka
+date: 2019-1-20 11:30:01
+
+
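+### How to modify the Flume source code
+Node 1 has to send the Flume data to both HBase and Kafka, but the HBase table layout is custom, so the code Flume uses to write to HBase must be modified.
+Project source: https://github.com/changeforeda/Big-Data-Project/tree/master/code/flume-ng-sinks
+Steps:
+1. Download the Flume source and import it into IntelliJ IDEA
+1) Download apache-flume-1.7.0-src.tar.gz and extract it locally
+2) Import the Flume source into IDEA
+Open IDEA, choose File -> Open, locate the source tree, select flume-ng-hbase-sink, and click OK to load that module's source.
+![](http://ww1.sinaimg.cn/large/005BOtkIly1fzd1atesvcj30dw0fmaae.jpg)
+2. Write your own class to carry the change. KfkAsyncHbaseEventSerializer is my custom class; modify the following method in it.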
+```java
+@Override
+    public List<PutRequest> getActions() {
+        List<PutRequest> actions = new ArrayList<>();
+        if (payloadColumn != null) {
+            byte[] rowKey;
+            try {
+                /*--------------------------- custom code begins ---------------------------------*/
+                // Parse the configured column names
+                String[] columns = new String(this.payloadColumn, Charsets.UTF_8).split(",");
+                // Parse the values of one line collected by Flume
+                String[] values = new String(this.payload, Charsets.UTF_8).split(",");
+
+                // Validation: skip the event unless every column has a matching value
+                if (columns.length != values.length) return actions;
+
+                // Timestamp
+                String datetime = values[0];
+                // User id
+                String userid = values[1];
+                // Build the business-specific row key once, so all columns of the event share one row
+                rowKey = SimpleRowKeyGenerator.getKfkRowKey(userid, datetime);
+                for (int i = 0; i < columns.length; i++) {
+                    byte[] colColumn = columns[i].getBytes(Charsets.UTF_8);
+                    byte[] colValue = values[i].getBytes(Charsets.UTF_8);
+                    // Insert the cell
+                    PutRequest putRequest = new PutRequest(table, rowKey, cf,
+                            colColumn, colValue);
+                    actions.add(putRequest);
+                }
+                /*--------------------------- custom code ends ---------------------------------*/
+            } catch (Exception e) {
+                throw new FlumeException("Could not get row key!", e);
+            }
+        }
+        return actions;
+    }
+```
+Add the custom key-generation method to this class
+```java
+public class SimpleRowKeyGenerator {
+
+  public static byte[] getKfkRowKey(String userid,String datetime)throws UnsupportedEncodingException {
+    return (userid + datetime + String.valueOf(System.currentTimeMillis())).getBytes("UTF8");
+  }
+}
+```
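+3. The change should be tested at this point, but I do not yet know how to set up a test environment for it, so I built the jar and used it directly on the virtual machine.
+4. Build the jar; IDEA makes this easy
+Reference: https://jingyan.baidu.com/article/c275f6ba0bbb65e33d7567cb.html
+1) In IDEA, choose File -> Project Structure
+2) Select Artifacts on the left, click the + button on the right, then choose JAR -> From modules with dependencies
+3) Be sure to set the Main Class field to the class you want to package, then click OK
+4) Remove the other dependency jars so that only flume-ng-hbase-sink is packaged.
+5) Click Apply, then OK
+6) Click Build; the jar is produced automatically
+7) Find the new jar under apache-flume-1.7.0-src\flume-ng-sinks\flume-ng-hbase-sink\classes\artifacts\flume_ng_hbase_sink_jar in the project
+8) Rename the jar to Flume's own name, flume-ng-hbase-sink-1.7.0.jar, upload it to the flume/lib directory on the virtual machine, and overwrite the existing jar.
+### Modifying the Flume configuration
+On node 1, edit the Flume configuration to complete the integration with HBase and Kafka. (The custom Flume jar has already been uploaded and has replaced the original.)
+Edit flume-conf.properties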
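+
+Since step 3 skips testing, the parsing and row-key logic from step 2 can at least be tried outside Flume. The following is a minimal sketch in plain Java under stated assumptions: `RowKeySketch`, `parse`, and `rowKey` are hypothetical names, the sample record is made up, and the clock value is passed in as a parameter (instead of calling `System.currentTimeMillis()`) so that the result is repeatable.
+
+```java
+import java.nio.charset.StandardCharsets;
+import java.util.LinkedHashMap;
+import java.util.Map;
+
+public class RowKeySketch {
+
+    // Pair each configured column name with the value at the same position.
+    // Returns an empty map when the counts differ, mirroring the validation step.
+    public static Map<String, String> parse(String payloadColumn, String line) {
+        String[] columns = payloadColumn.split(",");
+        String[] values = line.split(",");
+        Map<String, String> cells = new LinkedHashMap<>();
+        if (columns.length != values.length) {
+            return cells;
+        }
+        for (int i = 0; i < columns.length; i++) {
+            cells.put(columns[i], values[i]);
+        }
+        return cells;
+    }
+
+    // Same concatenation order as getKfkRowKey: user id, then time, then a clock value.
+    public static byte[] rowKey(String userid, String datetime, long nowMillis) {
+        return (userid + datetime + nowMillis).getBytes(StandardCharsets.UTF_8);
+    }
+
+    public static void main(String[] args) {
+        String columns = "datatime,userid,searchname";
+        String line = "00:00:01,u42,kafka"; // made-up sample record
+        Map<String, String> cells = parse(columns, line);
+        byte[] key = rowKey(cells.get("userid"), cells.get("datatime"), 1L);
+        System.out.println(cells + " -> " + new String(key, StandardCharsets.UTF_8));
+    }
+}
+```
+
+Putting the user id first groups all rows of one user next to each other in HBase, and the trailing clock value keeps two events with the same user and time from overwriting each other.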
+```
+agent1.sources = r1
+agent1.channels = kafkaC hbaseC 
+agent1.sinks =  kafkaSink hbaseSink
+
+agent1.sources.r1.type = avro
+agent1.sources.r1.channels = hbaseC kafkaC
+agent1.sources.r1.bind = bigdata-pro01.kfk.com
+agent1.sources.r1.port = 5555
+agent1.sources.r1.threads = 5
+# flume-hbase
+agent1.channels.hbaseC.type = memory
+agent1.channels.hbaseC.capacity = 100000
+agent1.channels.hbaseC.transactionCapacity = 100000
+agent1.channels.hbaseC.keep-alive = 20
+
+agent1.sinks.hbaseSink.type = asynchbase
+agent1.sinks.hbaseSink.table = weblogs
+agent1.sinks.hbaseSink.columnFamily = info
+agent1.sinks.hbaseSink.channel = hbaseC
+agent1.sinks.hbaseSink.serializer = org.apache.flume.sink.hbase.KfkAsyncHbaseEventSerializer
+agent1.sinks.hbaseSink.serializer.payloadColumn = datatime,userid,searchname,retorder,cliorder,cliurl
+#flume-kafka
+agent1.channels.kafkaC.type = memory
+agent1.channels.kafkaC.capacity = 100000
+agent1.channels.kafkaC.transactionCapacity = 100000
+agent1.channels.kafkaC.keep-alive = 20
+
+agent1.sinks.kafkaSink.channel = kafkaC
+agent1.sinks.kafkaSink.type = org.apache.flume.sink.kafka.KafkaSink
+agent1.sinks.kafkaSink.brokerList = bigdata-pro01.kfk.com:9092,bigdata-pro02.kfk.com:9092,bigdata-pro03.kfk.com:9092
+agent1.sinks.kafkaSink.topic = weblogs
+agent1.sinks.kafkaSink.zookeeperConnect = bigdata-pro01.kfk.com:2181,bigdata-pro02.kfk.com:2181,bigdata-pro03.kfk.com:2181
+agent1.sinks.kafkaSink.requiredAcks = 1
+agent1.sinks.kafkaSink.batchSize = 1
+agent1.sinks.kafkaSink.serializer.class = kafka.serializer.StringEncoder
+```
+
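+### Summary
+At this point the project has the Flume collection configuration on nodes 2 and 3, and the Flume configuration on node 1 that receives the data and forwards it to Kafka and HBase.
+As the figure below shows, this part is complete; the next chapter runs the end-to-end integration test. Keep going!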
+![](http://ww1.sinaimg.cn/large/005BOtkIly1fzd2e99ywhj30go0gp43u.jpg)
\ No newline at end of file
diff --git a/news-project.md b/news-project.md
new file mode 100644
index 0000000..412c307
--- /dev/null
+++ b/news-project.md
@@ -0,0 +1,119 @@
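+## Project: a real-time big data analytics and visualization system for a news site, based on Spark 2.x
+
+### Project overview
+
+**Goals**
+
+1. Complete the architecture design, installation and deployment, architecture integration and development, and interactive visualization design of a big data project
+
+2. Implement real-time online data analysis
+
+3. Implement offline data analysis
+
+**Features**
+
+1. Capture users' browsing logs
+
+2. Analyze in real time the 20 news topics with the highest traffic
+
+3. Count in real time the news topics currently exposed online
+
+4. Determine which time period has the highest user traffic
+
+5. Reports
+
+**Components used**
+Hadoop 2.x, ZooKeeper, Flume, Hive, HBase, Kafka, Spark 2.x, Spark Streaming, MySQL, Hue, J2EE, WebSocket, ECharts
+
+### Development tools
+
+Virtual machines: VMware, CentOS
+
+SSH client: SecureCRT (connects to several virtual machines from Windows)
+
+Code editor: IDEA
+
+Viewing data files: Notepad++ (with the NppFTP plugin installed, it is very handy for editing configuration files on the virtual machines)
+
+**Download link for all software:**
+
+Link: https://pan.baidu.com/s/18wrxmczkzgoNE2WTZwjPSA 
+Access code: 73q8 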
+
+
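+### Project architecture
+
+![](http://ww1.sinaimg.cn/large/005BOtkIly1fyccyao7f3j30op0ee10a.jpg)
+
+### Cluster resource planning
+
+The cluster runs on VMware virtual machines with CentOS; the host laptop should have at least 8 GB of memory.
+At minimum, clone three virtual machines and give each 2 GB of memory.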
+![](http://ww1.sinaimg.cn/large/005BOtkIly1fycdbmkr58j30m20ckq81.jpg)
+
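+### Implementation steps
+
+[1. Chapter 1: Project requirements analysis and design][1]
+
+[2. Chapter 2: Linux environment preparation and setup][2]
+
+[3. Chapter 3: Hadoop 2.x distributed cluster deployment][3]
+
+[4. Chapter 4: ZooKeeper distributed cluster deployment][4]
+
+[5. Chapter 5: Hadoop high availability (HA) configuration][5]
+
+[6. Chapter 6: Highly available HBase deployment on Hadoop HA][6]
+
+[7. Chapter 7: Introduction to Kafka and distributed deployment][7]
+
+[8. Chapter 8: Introduction to Flume and distributed deployment][8]
+
+[9. Chapter 9: Modifying the Flume source and integrating HBase + Kafka][9]
+
+[10. Chapter 10: End-to-end test of the Flume + HBase + Kafka integration][10]
+
+[11. Chapter 11: Installing and integrating MySQL and Hive][11]
+
+[12. Chapter 12: Integrating Hive with HBase][12]
+
+[13. Chapter 13: Big data visual analysis with Cloudera Hue][13]
+
+[14. Chapter 14: Spark 2.x cluster installation and Spark on YARN deployment][14]
+
+[15. Chapter 15: Developing Spark 2.x programs in IDEA][15]
+
+[16. Chapter 16: Real-time data processing with Spark Streaming][16]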
+
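+### Companion videos
+
+Link: https://pan.baidu.com/s/1Q-XGRjRwyVa0UFSzfbjFdQ 
+
+Access code: qart 
+
+### The group shares more related e-books and 1000 GB of cloud-drive material
+This QQ group is for job-hunting discussion, technical exchange, and sharing TALKDATA's latest interview write-ups.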
+
+![image](image/qqqun.jpg)
+
+![](https://ftp.bmp.ovh/imgs/2020/01/2c05f26fe8c5546d.png)
+
+![uLIqN4.png](https://s2.ax1x.com/2019/10/12/uLIqN4.png)
+
+
+  [1]: https://github.com/TALKDATA/JavaBigData/blob/master/news-bigdataproject/1%E3%80%81%E9%A1%B9%E7%9B%AE%E9%9C%80%E6%B1%82.md
+  [2]: https://github.com/TALKDATA/JavaBigData/blob/master/news-bigdataproject/2%E3%80%81linux%E9%85%8D%E7%BD%AE.md
+  [3]: https://github.com/TALKDATA/JavaBigData/blob/master/news-bigdataproject/3%E3%80%81hadoop%E9%83%A8%E7%BD%B2.md
+  [4]: https://github.com/TALKDATA/JavaBigData/blob/master/news-bigdataproject/4%E3%80%81zk%E9%83%A8%E7%BD%B2.md
+  [5]: https://github.com/TALKDATA/JavaBigData/blob/master/news-bigdataproject/5%E3%80%81ha%E5%AE%9E%E7%8E%B0.md
+  [6]: https://github.com/TALKDATA/JavaBigData/blob/master/news-bigdataproject/6%E3%80%81hbase%E9%83%A8%E7%BD%B2.md
+  [7]: https://github.com/TALKDATA/JavaBigData/blob/master/news-bigdataproject/7%E3%80%81kafka%E9%83%A8%E7%BD%B2.md
+  [8]: https://github.com/TALKDATA/JavaBigData/blob/master/news-bigdataproject/8%E3%80%81flume%E9%83%A8%E7%BD%B2.md
+  [9]: https://github.com/TALKDATA/JavaBigData/blob/master/news-bigdataproject/9%E3%80%81flume-hbase-kfk%E9%85%8D%E7%BD%AE.md
+  [10]: https://github.com/TALKDATA/JavaBigData/blob/master/news-bigdataproject/10%E3%80%81flume-hbase-kfk%E8%81%94%E8%B0%83.md
+  [11]: https://github.com/TALKDATA/JavaBigData/blob/master/news-bigdataproject/11%E3%80%81mysql-hive.md
+  [12]: https://github.com/TALKDATA/JavaBigData/blob/master/news-bigdataproject/12%E3%80%81hive-hbase.md
+  [13]: https://github.com/TALKDATA/JavaBigData/blob/master/news-bigdataproject/13%E3%80%81hue.md
+  [14]: https://github.com/TALKDATA/JavaBigData/blob/master/news-bigdataproject/14%E3%80%81spark%20on%20yarn.md
+  [15]: https://github.com/TALKDATA/JavaBigData/blob/master/news-bigdataproject/15%E3%80%81spark-idea.md
+  [16]: https://github.com/TALKDATA/JavaBigData/blob/master/news-bigdataproject/16%E3%80%81spark-streaming1.md
\ No newline at end of file
diff --git "a/study/Lombok\344\275\277\347\224\250.md" "b/study/Lombok\344\275\277\347\224\250.md"
new file mode 100644
index 0000000..5a0de9d
--- /dev/null
+++ "b/study/Lombok\344\275\277\347\224\250.md"
@@ -0,0 +1,183 @@
+## Using Lombok
+
+Official documentation: https://projectlombok.org/features/all
+### Add the dependency
+```xml
+        <dependency>
+            <groupId>org.projectlombok</groupId>
+            <artifactId>lombok</artifactId>
+            <version>1.18.12</version>
+            <scope>provided</scope>
+        </dependency>
+```
+
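+### Install the Lombok plugin in IDEA
+Without the plugin, the IDE does not recognize the generated members while you write code and reports errors.
+Be sure to check Enable annotation processing; otherwise the project will not compile.
+
+### Common annotations
+```
+@Setter : on a field; generates a setter method for the field
+@Getter : on a field; generates a getter method for the field
+@Log4j2 : on a class; adds a log4j2 logger field named log
+@NoArgsConstructor : on a class; generates a no-argument constructor
+@AllArgsConstructor : on a class; generates a constructor that takes all fields
+@Builder : adds the builder pattern to the annotated class
+@NonNull : on a parameter; passing null throws a NullPointerException
+@Cleanup : closes the annotated resource (such as a stream) automatically
+```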
+
+### Usage example
+```java
+@NoArgsConstructor
+@AllArgsConstructor
+@Setter
+@Getter
+@Builder
+@Log4j2   // requires log4j2 to be on the classpath
+@ToString
+public class StudyLombok {
+
+    @Builder.Default  // default used when the builder (or the no-args constructor) does not set var1
+    private int var1 =10;
+
+    private String var2;
+
+    public void logLombok(){
+        log.warn("lombok log warn:"+toString());
+    }
+
+}
+```
+
+Generated class file (decompiled)
+```java
+//
+// Source code recreated from a .class file by IntelliJ IDEA
+// (powered by Fernflower decompiler)
+//
+
+package com.alipay;
+
+import org.apache.logging.log4j.LogManager;
+import org.apache.logging.log4j.Logger;
+
+public class StudyLombok {
+    private static final Logger log = LogManager.getLogger(StudyLombok.class);
+    private int var1;
+    private String var2;
+
+    public void logLombok() {
+        log.warn("lombok log warn:" + this.toString());
+    }
+
+    private static int $default$var1() {
+        return 10;
+    }
+
+    public static StudyLombok.StudyLombokBuilder builder() {
+        return new StudyLombok.StudyLombokBuilder();
+    }
+
+    public StudyLombok() {
+        this.var1 = $default$var1();
+    }
+
+    public StudyLombok(int var1, String var2) {
+        this.var1 = var1;
+        this.var2 = var2;
+    }
+
+    public void setVar1(int var1) {
+        this.var1 = var1;
+    }
+
+    public void setVar2(String var2) {
+        this.var2 = var2;
+    }
+
+    public int getVar1() {
+        return this.var1;
+    }
+
+    public String getVar2() {
+        return this.var2;
+    }
+
+    public String toString() {
+        return "StudyLombok(var1=" + this.getVar1() + ", var2=" + this.getVar2() + ")";
+    }
+
+    public static class StudyLombokBuilder {
+        private boolean var1$set;
+        private int var1$value;
+        private String var2;
+
+        StudyLombokBuilder() {
+        }
+
+        public StudyLombok.StudyLombokBuilder var1(int var1) {
+            this.var1$value = var1;
+            this.var1$set = true;
+            return this;
+        }
+
+        public StudyLombok.StudyLombokBuilder var2(String var2) {
+            this.var2 = var2;
+            return this;
+        }
+
+        public StudyLombok build() {
+            int var1$value = this.var1$value;
+            if (!this.var1$set) {
+                var1$value = StudyLombok.$default$var1();
+            }
+
+            return new StudyLombok(var1$value, this.var2);
+        }
+
+        public String toString() {
+            return "StudyLombok.StudyLombokBuilder(var1$value=" + this.var1$value + ", var2=" + this.var2 + ")";
+        }
+    }
+}
+
+```
+### Test class
+
+```java
+    public static void main(String[] args) {
+
+        // Create objects without setting any argument; the declared default is applied
+        StudyLombok studyLombok = StudyLombok.builder().build();
+        studyLombok.logLombok();
+        StudyLombok studyLombok1 = new StudyLombok();
+        studyLombok1.logLombok();
+
+        // Create objects with all arguments set
+        StudyLombok studyLombok2 = StudyLombok.builder().var1(111).var2("111").build();
+        studyLombok2.logLombok();
+        StudyLombok studyLombok3 = new StudyLombok(111,"111");
+        studyLombok3.logLombok();
+
+        // Create an object with only some arguments set, using the builder
+        StudyLombok studyLombok4 = StudyLombok.builder().var2("111").build();
+        studyLombok4.logLombok();
+
+
+     }
+```
+Console log output
+```
+14:59:40.657 [main] WARN  com.alipay.StudyLombok - lombok log warn:StudyLombok(var1=10, var2=null)
+14:59:40.662 [main] WARN  com.alipay.StudyLombok - lombok log warn:StudyLombok(var1=10, var2=null)
+14:59:40.663 [main] WARN  com.alipay.StudyLombok - lombok log warn:StudyLombok(var1=111, var2=111)
+14:59:40.663 [main] WARN  com.alipay.StudyLombok - lombok log warn:StudyLombok(var1=111, var2=111)
+14:59:40.663 [main] WARN  com.alipay.StudyLombok - lombok log warn:StudyLombok(var1=10, var2=111)
+```
+
+
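+### Notes
+1. Prefer not to use @Data; it becomes complicated once inheritance is involved.
+2. @NonNull: a parameter with this annotation throws a NullPointerException when null is passed.
+3. @Builder.Default sets the default value used by the builder.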
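+
+What note 2 means in practice can be shown without Lombok. This is a hand-written approximation of the null check that `@NonNull` adds to a method, not Lombok's actual output: `NonNullSketch` and `greet` are hypothetical names, and the exception message is an assumption that may differ from what a given Lombok version generates.
+
+```java
+public class NonNullSketch {
+
+    // Roughly what a @NonNull parameter expands to: a null check at the top of the method.
+    public static String greet(String name) {
+        if (name == null) {
+            throw new NullPointerException("name is marked non-null but is null");
+        }
+        return "hello " + name;
+    }
+
+    public static void main(String[] args) {
+        System.out.println(greet("lombok"));
+        try {
+            greet(null);
+        } catch (NullPointerException e) {
+            // The failure happens at the call boundary, not deep inside later code.
+            System.out.println("rejected: " + e.getMessage());
+        }
+    }
+}
+```
+
+Failing at the method entry makes the stack trace point at the caller that passed null, which is usually easier to debug than a NullPointerException raised several calls later.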
\ No newline at end of file
diff --git "a/study/log4j2\344\275\277\347\224\250\346\225\231\347\250\213.md" "b/study/log4j2\344\275\277\347\224\250\346\225\231\347\250\213.md"
new file mode 100644
index 0000000..09fe92c
--- /dev/null
+++ "b/study/log4j2\344\275\277\347\224\250\346\225\231\347\250\213.md"
@@ -0,0 +1,72 @@
+#### log4j2 tutorial
+
+
+Official documentation: https://logging.apache.org/log4j/2.x/manual/configuration.html
+### Add the dependencies
+```xml
+        <dependency>
+            <groupId>org.apache.logging.log4j</groupId>
+            <artifactId>log4j-api</artifactId>
+            <version>2.13.3</version>
+        </dependency>
+        <dependency>
+            <groupId>org.apache.logging.log4j</groupId>
+            <artifactId>log4j-core</artifactId>
+            <version>2.13.3</version>
+        </dependency>
+```
+
+### A typical log4j2.xml
+```xml
+<?xml version="1.0" encoding="UTF-8"?>
+<Configuration status="WARN">
+
+    <!-- Output targets -->
+    <Appenders>
+        <!-- Console output -->
+        <Console name="Console" target="SYSTEM_OUT">
+            <PatternLayout pattern="%d{HH:mm:ss.SSS} [%t] %-5level %logger{36} - %msg%n"/>
+        </Console>
+
+        <!-- File output -->
+        <RollingFile name="RollingFile" fileName="D:/logs/web.log"
+                     filePattern="logs/$${date:yyyy-MM}/web-%d{MM-dd-yyyy}-%i.log.gz">
+            <PatternLayout pattern="%d{yyyy-MM-dd 'at' HH:mm:ss z} %-5level %class{36} %L %M - %msg%xEx%n"/>
+            <SizeBasedTriggeringPolicy size="2MB"/>
+        </RollingFile>
+
+    </Appenders>
+
+    <!-- Only appenders referenced by a logger here produce output -->
+    <Loggers>
+        <!-- Root logger: write to the console -->
+        <Root level="warn">
+            <AppenderRef ref="Console"/>
+        </Root>
+        <!-- Send the output of one specific class to the file appender -->
+        <Logger name="logTest" level="warn" additivity="true">
+            <AppenderRef ref="RollingFile" />
+        </Logger>
+    </Loggers>
+
+</Configuration>
+```
+```
+
+### Test
+```java
+public class logTest {
+
+    private static final Logger logger = LogManager.getLogger(logTest.class);
+
+    public static void main(String[] args) {
+        logger.warn("测试日志输入warn");
+    }
+}
+```
+
+### Console output
+13:57:41.569 [main] WARN  logTest - 测试日志输入warn
+### File output (D:/logs/web.log)
+2020-08-02 at 13:57:41 CST WARN  logTest 9 main - 测试日志输入warn
+
+
diff --git a/test.txt b/test.txt
deleted file mode 100644
index ff2b784..0000000
--- a/test.txt
+++ /dev/null
@@ -1 +0,0 @@
-hellow i am branch develop
\ No newline at end of file