• Developing Quality Metadata

    Data and metadata drive TV production and broadcast scheduling systems. Metadata helps to manage content, and when you examine a broadcast infrastructure, much of what happens there involves manipulating nonvideo data formats. Interactive TV and digital text services also require data in large quantities. Managing it, cleaning it, and routing it to the correct place at the right time and in the right format are all issues familiar to data management professionals inside and outside the media industry. While I wrote this book, I spent a little time working with some developers who build investment-banking systems. Interestingly, they face problems identical to those of my colleagues in broadcasting. I suspected this was the case all along, because I have frequently deployed solutions in broadcasting that I learned from projects in nonbroadcast industries. Spend some ‘sabbatical’ time in another kind of industry. It will give you some useful insights. Workflow in an organization of any size is composed of many discrete steps. Whether you work on your own or in a large enterprise, the processes are very similar. The scale of the organization just dictates the quantity. The quality needs to be maintained at the highest level in both cases. The data and metadata workflow tools you choose and use are critical to your success. The key word here is tools. With good tools, you can “Push the Envelope” and raise your product quality. There has been much discussion about metadata systems and data warehouses. Systems used as data repositories are useful, but if you don’t put good-quality data into them you are just wasting your time. We need to focus on making sure the data is as good as possible and stays that way. Raw data is often in something of a mess. A series of steps is required to clean the data so it can be used. Sometimes even the individual fields need to be broken down so that the meaning can be extracted.
This book is not so much about storage systems as about what gets stored in them. There are defensive coding techniques you can use as avoidance strategies. There are also implications for designing database schemas. Data entry introduces problems at the outset, so it needs to be of as high a quality as possible or the entire process is compromised. The book describes risk factors and illustrates them with real-world case examples that show how they were neutralized. Planning your systems well and fixing problems before they happen is far cheaper than clearing up the mess afterwards. This book is designed to be practical. Nontechnical staff who read it will understand why some of the architectural designs for their systems are hard to implement. People in the implementation area will find insights that help solve some of the issues that confront them. A lot of the advice takes the form of case studies based on genuine experience of building workflows. Some explanation is given of the background to each problem and why it needs to be solved. The material is divided into two parts. Part 1 deals with theory, while Part 2 provides many practical examples in the form of tutorials. We lay a foundation for further projects that look inside the media files and examine audio/video storage and the various tools that you can build for manipulating them. Before embarking on that, we need to manage a variety of data and metadata components and get that right first.

    0
    104
    7.43MB
    2011-03-17
    8
  • Data Quality Analysis Methods for Data Warehouse Projects

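    An original data quality analysis method for data warehouse projects, written by an expert. Because it was written around a specific project, it is highly practical.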

    5
    408
    216KB
    2011-03-17
    43
  • data quality assessment

    by Leo L. Pipino, Yang W. Lee, and Richard Y. Wang. A good paper on data quality.

    0
    94
    1023KB
    2011-03-17
    9